Praise for the first edition


“Quantum field theory is an extraordinarily beautiful subject, but it can be an intimidating 
one. The profound and deeply physical concepts it embodies can get lost, to the beginner, 
amidst its technicalities. In this book, Zee imparts the wisdom of an experienced and 
remarkably creative practitioner in a user-friendly style. I wish something like it had been 
available when I was a student.” 

—Frank Wilczek, Massachusetts Institute of Technology 

“Finally! Zee has written a ground-breaking quantum field theory text based on the course 
I made him teach when I chaired the Princeton physics department. With utmost clarity 
he gives the eager student a light-hearted and easy-going introduction to the multifaceted 
wonders of quantum field theory. I wish I had this book when I taught the subject.” 

—Marvin L. Goldberger, President, Emeritus, California Institute of Technology 

“This book is filled with charming explanations that students will find beneficial.” 

—Ed Witten, Institute for Advanced Study 

“This book is perhaps the most user-friendly introductory text to the essentials of quantum 
field theory and its many modern applications. With his physically intuitive approach, 
Professor Zee makes a serious topic more reachable for beginners, reducing the conceptual 
barrier while preserving enough mathematical details necessary for a firm grasp of the 
subject.” 

—Bei Lok Hu, University of Maryland 

“Like the famous Feynman Lectures on Physics, this book has the flavor of a good 
blackboard lecture. Zee presents technical details, but only insofar as they serve the larger 
purpose of giving insight into quantum field theory and bringing out its beauty.” 

—Stephen M. Barr, University of Delaware 

“This is a fantastic book—exciting, amusing, unique, and very valuable.” 

—Clifford V. Johnson, University of Durham 

“Tony Zee explains quantum field theory with a clear and engaging style. For budding or 
seasoned condensed matter physicists alike, he shows us that field theory is a nourishing 
nut to be cracked and savored.” 

—Matthew P. A. Fisher, Kavli Institute for Theoretical Physics 

“I was so engrossed that I spent all of Saturday and Sunday one weekend absorbing half 
the book, to my wife’s dismay. Zee has a talent for explaining the most abstruse and 
arcane concepts in an elegant way, using the minimum number of equations (the jokes 
and anecdotes help). ... I wish this were available when I was a graduate student. Buy 
the book, keep it by your bed, and relish the insights delivered with such flair and grace.”

—N. P. Ong, Princeton University



What readers are saying 


“Funny, chatty, physical: QFT education transformed!! This text stands apart from others 
in so many ways that it’s difficult to list them all. . . . The exposition is breezy and chatty. 
The text is never boring to read, and is at times very, very funny. Puns and jokes abound, 
as do anecdotes. ... A book which is much easier, and more fun, to read than any of the 
others. Zee’s skills as a popular physics writer have been used to excellent effect in writing 
this textbook. . . . Wholeheartedly recommended.” 

—M. Haque 

“A readable, and rereadable instant classic on QFT. ... At an introductory level, this type 
of book—with its pedagogical (and often very funny) narrative—is priceless. [It] is full 
of fantastic insights akin to reading the Feynman lectures. I have since used QFT in a 
Nutshell as a review for [my] year-long course covering all of Peskin and Schroeder, and
have been pleasantly surprised at how Zee is able to preemptively answer many of the 
open questions that eluded me during my course. ... I value QFT in a Nutshell the same 
way I do the Feynman lectures. . . . It’s a text to teach an understanding of physics.” 

—Flip Tanedo 

“One of those books a person interested in theoretical physics simply must own! A real 
scientific masterpiece. I bought it at the time I was a physics sophomore and that was the 
best choice I could have made. It was this book that triggered my interest in quantum field 
theory and crystallized my dreams of becoming a theoretical physicist. . . . The main goal 
of the book is to make the reader gain real intuition in the field. Amazing . . . amusing . . . 
real fun. What also distinguishes this book from others dealing with a similar subject 
is that it is written like a tale. ... I feel enormously fortunate to have come across this 
book at the beginning of my adventure with theoretical physics. . . . Definitely the best 
quantum field theory book I have ever read.” 

—Anonymous 

“I have used Quantum Field Theory in a Nutshell as the primary text. ... I am immensely 
pleased with the book, and recommend it highly. . . . Don’t let the ‘damn the torpedoes, 
full steam ahead’ approach scare you off. Once you get used to seeing the physics quickly, 
I think you will find the experience very satisfying intellectually.” 

—Jim Napolitano 

“This is undoubtedly the best book I have ever read about the subject. Zee does a fantastic 
job of explaining quantum field theory, in a way I have never seen before, and I have 
read most of the other books on this topic. If you are looking for quantum field theory 
explanations that are clear, precise, concise, intuitive, and fun to read—this is the book 
for you.” 

—Anonymous 



“One of the most artistic and deepest books ever written on quantum field theory. 
Amazing . . . extremely pleasant ... a lot of very deep and illuminating remarks. ... I 
recommend the book by Zee to everybody who wants to get a clear idea what good physics 
is about.” 

—Slava Mukhanov 

“Perfect for learning field theory on your own—by far the clearest and easiest to follow 
book I’ve found on the subject.” 

—Ian Z. Lovejoy 

“A beautifully written introduction to the modern view of fields . . . breezy and 
enchanting, leading to exceptional clarity without sacrificing depth, breadth, or rigor 
of content. . . . [It] passes my test of true greatness: I wish it had been the first book on 
this topic that I had found.” 

—Jeffrey D. Scargle 

“A breeze of fresh air ... a real literary gem which will be useful for students who make 
their first steps in this difficult subject and an enjoyable treat for experts, who will find 
new and deep insights. Indeed, the Nutshell is like a bright light source shining among 
tall and heavy trees—the many more formal books that exist—and helps seeing the forest 
as a whole! ... I have been practicing QFT during the past two decades and with all my 
experience I was thrilled with enjoyment when I read some of the sections.” 

—Joshua Feinberg 

“This text not only teaches up-to-date quantum field theory, but also tells readers how 
research is actually done and shows them how to think about physics. [It teaches things 
that] people usually say ‘cannot be learned from books.’ [It is] in the same style as Fearful 
Symmetry and Einstein’s Universe. All three books . . . are classics.” 

—Yu Shi 

“I belong to the [group of] enthusiastic laymen having enough curiosity and insistence . . . 
but lacking the mastery of advanced math and physics. . . . I really could not see the forest 
for the trees. But at long last I got this book!” 

—Makay Attila

“More fun than any other QFT book I have read. The comparisons to Feynman’s 
writings made by several of the reviewers seem quite apt. . . . His enthusiasm is quite 
infectious. ... I doubt that any other book will spark your interest like this one does.” 

—Stephen Wandzura 

“I’m having a blast reading this book. It’s both deep and entertaining; this is a rare breed, 
indeed. I usually prefer the more formal style (big Landau fan), but I have to say that when 
Zee has the talent to present things his way, it’s a definite plus.” 

—Pierre Jouvelot 



“Required reading for QFT: [it] heralds the introduction of a book on quantum field theory 
that you can sit down and read. My professor’s lectures made much more sense as I 
followed along in this book, because concepts were actually EXPLAINED, not just worked 
out.” 

—Alexander Scott 

“Not your father’s quantum field theory text: I particularly appreciate that things are 
motivated physically before their mathematical articulation. . . . Most especially though, 
the author’s ‘heuristic’ descriptions are the best I have read anywhere. From them alone 
the essential ideas become crystal clear.” 

—Dan Dill 
Preface to the First Edition


As a student, I was rearing at the bit, after a course on quantum mechanics, to learn 
quantum field theory, but the books on the subject all seemed so formidable. Fortunately, 
I came across a little book by Mandl on field theory, which gave me a taste of the subject 
enabling me to go on and tackle the more substantive texts. I have since learned that other 
physicists of my generation had similar good experiences with Mandl. 

In the last three decades or so, quantum field theory has veritably exploded and Mandl 
would be hopelessly out of date to recommend to a student now. Thus I thought of writing 
a book on the essentials of modern quantum field theory addressed to the bright and eager 
student who has just completed a course on quantum mechanics and who is impatient to 
start tackling quantum field theory. 

I envisaged a relatively thin book, thin at least in comparison with the many weighty 
tomes on the subject. I envisaged the style to be breezy and colloquial, and the choice 
of topics to be idiosyncratic, certainly not encyclopedic. I envisaged having many short 
chapters, keeping each chapter “bite-sized.” 

The challenge in writing this book is to keep it thin and accessible while at the same 
time introducing as many modern topics as possible. A tough balancing act! In the end, 
I had to be unrepentantly idiosyncratic in what I chose to cover. Note to the prospective 
book reviewer: You can always criticize the book for leaving out your favorite topics. I do 
not apologize in any way, shape, or form. My motto in this regard (and in life as well), 
taken from the Ricky Nelson song “Garden Party,” is “You can’t please everyone so you 
gotta please yourself.” 

This book differs from other quantum field theory books that have come out in recent 
years in several respects. 

I want to get across the important point that the usefulness of quantum field theory is far 
from limited to high energy physics, a misleading impression my generation of theoretical 
physicists were inculcated with and which amazingly enough some recent textbooks on 
quantum field theory (all written by high energy physicists) continue to foster. For instance,
the study of driven surface growth provides a particularly clear, transparent, and physical 
example of the importance of the renormalization group in quantum field theory. Instead 
of being entangled in all sorts of conceptual irrelevancies such as divergences, we have 
the obviously physical notion of changing the ruler used to measure the fluctuating 
surface. Other examples include random matrix theory and Chern-Simons gauge theory 
in quantum Hall fluids. I hope that condensed matter theory students will find this book 
helpful in getting a first taste of quantum field theory. The book is divided into eight parts, 1 
with two devoted more or less exclusively to condensed matter physics. 

I try to give the reader at least a brief glimpse into contemporary developments, for 
example, just enough of a taste of string theory to whet the appetite. This book is perhaps 
also exceptional in incorporating gravity from the beginning. Some topics are treated quite 
differently than in traditional texts. I introduce the Faddeev-Popov method to quantize 
electromagnetism and the language of differential forms to develop Yang-Mills theory, for 
example. 

The emphasis is resoundingly on the conceptual rather than the computational. The 
only calculation I carry out in all its gory details is that of the magnetic moment of the 
electron. Throughout, specific examples rather than heavy abstract formalism will be 
favored. Instead of dealing with the most general case, I always opt for the simplest. 

I had to struggle constantly between clarity and wordiness. In trying to anticipate and to 
minimize what would confuse the reader, I often find that I have to belabor certain points 
more than what I would like. 

I tried to avoid the dreaded phrase “It can be shown that ...” as much as possible. 
Otherwise, I could have written a much thinner book than this! There are indeed thinner 
books on quantum field theory: I looked at a couple and discovered that they hardly explain 
anything. I must confess that I have an almost insatiable desire to explain. 

As the manuscript grew, the list of topics that I reluctantly had to drop also kept growing. 
So many beautiful results, but so little space! It almost makes me ill to think about all the 
stuff (bosonization, instanton, conformal field theory, etc., etc.) I had to leave out. As one 
colleague remarked, the nutshell is turning into a coconut shell! 

Shelley Glashow once described the genesis of physical theories: “Tapestries are made 
by many artisans working together. The contributions of separate workers cannot be 
discerned in the completed work, and the loose and false threads have been covered over.” I 
regret that other than giving a few tidbits here and there I could not go into the fascinating 
history of quantum field theory, with all its defeats and triumphs. On those occasions 
when I refer to original papers I suffer from that disconcerting quirk of human psychology 
of tending to favor my own more than decorum might have allowed. I certainly did not 
attempt a true bibliography. 


1 Murray Gell-Mann used to talk about the eightfold way to wisdom and salvation in Buddhism (M. Gell-Mann 
and Y. Ne’eman, The Eightfold Way). Readers familiar with contemporary Chinese literature would know that the 
celestial dragon has eight parts. 
The genesis of this book goes back to the quantum field theory course I taught as a
beginning assistant professor at Princeton University. I had the enormous good fortune 
of having Ed Witten as my teaching assistant and grader. Ed produced lucidly written 
solutions to the homework problems I assigned, to the extent that the next year I went 
to the chairman to ask “What is wrong with the TA I have this year? He is not half as 
good as the guy last year!” Some colleagues asked me to write up my notes for a much 
needed text (those were the exciting times when gauge theories, asymptotic freedom, 
and scores of topics not to be found in any texts all had to be learned somehow) but a 
wiser senior colleague convinced me that it might spell disaster for my research career. 
Decades later, the time has come. I particularly thank Murph Goldberger for urging me 
to turn what expository talents I have from writing popular books to writing textbooks. It 
is also a pleasure to say a word in memory of the late Sam Treiman, teacher, colleague, 
and collaborator, who as a member of the editorial board of Princeton University Press 
persuaded me to commit to this project. I regret that my slow pace in finishing the book 
deprived him of seeing the finished product. 

Over the years I have refined my knowledge of quantum field theory in discussions 
with numerous colleagues and collaborators. As a student, I attended courses on quantum
field theory offered by Arthur Wightman, Julian Schwinger, and Sidney Coleman. I
was fortunate that these three eminent physicists each has his own distinctive style and 
approach. 

The book has been tested “in the field” in courses I taught. I used it in my field theory 
course at the University of California at Santa Barbara, and I am grateful to some of 
the students, in particular Ted Erler, Andrew Frey, Sean Roy, and Dean Townsley, for 
comments. I benefitted from the comments of various distinguished physicists who read 
all or parts of the manuscript, including Steve Barr, Doug Eardley, Matt Fisher, Murph 
Goldberger, Victor Gurarie, Steve Hsu, Bei-Lok Hu, Clifford Johnson, Mehran Kardar, Ian
Low, Joe Polchinski, Arkady Vainshtein, Frank Wilczek, Ed Witten, and especially Joshua 
Feinberg. Joshua also did many of the exercises. 

Talking about exercises: You didn’t get this far in physics without realizing the absolute 
importance of doing exercises in learning a subject. It is especially important that you do 
most of the exercises in this book, because to compensate for its relative slimness I have 
to develop in the exercises a number of important points some of which I need for later 
chapters. Solutions to some selected problems are given. 

I will maintain a web page http://theory.kitp.ucsb.edu/~zee/nuts.html listing all the
errors, typographical and otherwise, and points of confusion that will undoubtedly come 
to my attention. 

I thank my editors, Trevor Lipscombe, Sarah Green, and the staff of Princeton Editorial 
Associates (particularly Cyd Westmoreland and Evelyn Grossberg) for their advice and for 
seeing this project through. Finally, I thank Peter Zee for suggesting the cover painting. 
Preface to the Second Edition


What one fool could understand, another can. 

—R. P. Feynman 1 


Appreciating the appreciators 

It has been nearly six years since this book was published on March 10, 2003. Since authors 
often think of books as their children, I may liken the flood of appreciation from readers, 
students, and physicists to the glorious report cards a bright child brings home from 
school. Knowing that there are people who appreciate the care and clarity crafted into the 
pedagogy is a most gratifying feeling. In working on this new edition, merely looking at 
the titles of the customer reviews on Amazon.com would lighten my task and quicken my 
pace: “Funny, chatty, physical. QFT education transformed!,” “A readable, and re-readable 
instant classic on QFT,” “A must read book if you want to understand essentials in QFT,” 
“One of the most artistic and deepest books ever written on quantum field theory,” “Perfect 
for learning field theory on your own,” “Both deep and entertaining,” “One of those books 
a person interested in theoretical physics simply must own,” and so on. 

In a Physics Today review, Zvi Bern, a preeminent younger field theorist, wrote: 

Perhaps foremost in his mind was how to make Quantum Field Theory in a Nutshell as much fun 
as possible. ... I have not had this much fun with a physics book since reading The Feynman 
Lectures on Physics. . . . [This is a book] that no student of quantum field theory should be 
without. Quantum Field Theory in a Nutshell is the ideal book for a graduate student to curl up 
with after having completed a course on quantum mechanics. But, mainly, it is for anyone who 
wishes to experience the sheer beauty and elegance of quantum field theory. 

A classical Chinese scholar famously lamented “He who knows me are so few!” but here 
Zvi read my mind. 

Einstein proclaimed, “Physics should be made as simple as possible, but not any 
simpler.” My response would be “Physics should be made as fun as possible, but not 


1 R. P. Feynman, QED: The Strange Theory of Light and Matter, p. xx. 
any funnier.” I overcame the editor’s reluctance and included jokes and stories. And yes, I
have also written a popular book Fearful Symmetry about the “sheer beauty and elegance” 
of modern physics, which at least in that book largely meant quantum field theory. I want 
to share that sense of fun and beauty as much as possible. I’ve heard some people say that 
“Beauty is truth” but “Beauty is fun” is more like it. 

I had written books before, but this was my first textbook. The challenges and rewards 
in writing different types of book are certainly different, but to me, a university professor 
devoted to the ideals of teaching, the feeling of passing on what I have learned and 
understood is simply incomparable. (And the nice part is that I don’t have to hand out 
final grades.) It may sound corny, but I owe it, to those who taught me and to those 
authors whose field theory texts I studied, to give something back to the theoretical physics 
community. It is a wonderful feeling for me to meet young hotshot researchers who had 
studied this text and now know more about field theory than I do. 


How I made the book better: The first text that covers the twenty-first century 

When my editor Ingrid Gnerlich asked me for a second edition I thought long and hard 
about how to make this edition better than the first. I have clarified and elaborated here 
and there, added explanations and exercises, and done more “practical” Feynman diagram 
calculations to appease those readers of the first edition who felt that I didn’t calculate 
enough. There are now three more chapters in the main text. I have also made the “most 
accessible” text on quantum field theory even more accessible by explaining stuff that 
I thought readers who already studied quantum mechanics should know. For example, 
I added a concise review of the Dirac delta function to chapter I.2. But to the guy on
Amazon.com who wanted complex analysis explained, sorry, I won’t do it. There is a limit. 
Already, I gave a basically self-contained coverage of group theory. 

More excitingly, and to make my life more difficult, I added, to the existing eight parts 
(of the celestial dragon), a new part consisting of four chapters, covering field theoretic 
happenings of the last decade or so. Thus I can say that this is the first text since the birth 
of quantum field theory in the late 1920s that covers the twenty-first century. 

Quantum field theory is a mature but certainly not a finished subject, as some students
mistakenly believe. As one of the deepest constructs in theoretical physics and all
encompassing in its reach, it is bound to have yet unplumbed depths, secret subterranean 
connections, and delightful surprises. While many theoretical physicists have moved past 
quantum field theory to string theory and even string field theory, they often take the limit 
in which the string description reduces to a field description, thus on occasion revealing 
previously unsuspected properties of quantum field theories. We will see an example in 
chapter N.4. 

My friends admonished me to maintain, above all else, the “delightful tone” of the first 
edition. I hope that I have succeeded, even though the material contained in part N is “hot 
off the stove” stuff, unlike the long-understood material covered in the main text. I also 
added a few jokes and stories, such as the one about Fermi declining to trace. 
As with the first edition, I will maintain a web site
http://theory.kitp.ucsb.edu/~zee/nuts2.html listing the errors, typographical or otherwise, that will undoubtedly come to
my attention. 


Encouraging words 

In the quote that started this preface, Feynman was referring to himself, and to you! Of 
course, Feynman didn’t simply understand the quantum field theory of electromagnetism, 
he also invented a large chunk of it. To paraphrase Feynman, I wrote this book for fools 
like you and me. If a fool like me could write a book on quantum field theory, then surely 
you can understand it. 

As I said in the preface to the first edition, I wrote this book for those who, having 
learned quantum mechanics, are eager to tackle quantum field theory. During a sabbatical 
year (2006-07) I spent at Harvard, I was able to experimentally verify my hypothesis that 
a person who has mastered quantum mechanics could understand my book on his or her 
own without much difficulty. I was sent a freshman who had taught himself quantum 
mechanics in high school. I gave him my book to read and every couple of weeks or so 
he came by to ask a question or two. Even without these brief sessions, he would have 
understood much of the book. In fact, at least half of his questions stemmed from the holes
in his knowledge of quantum mechanics. I have incorporated my answers to his field 
theoretic questions into this edition. 

As I also said in the original preface, I had tested some of the material in the book “in the 
field” in courses I taught at Princeton University and later at the University of California at 
Santa Barbara. Since 2003, I have been gratified to know that it has been used successfully
in courses at many institutions. 

I understand that, of the different groups of readers, those who are trying to learn 
quantum field theory on their own could easily get discouraged. Let me offer you some 
cheering words. First of all, that is very admirable of you! Of all the established subjects 
in theoretical physics, quantum field theory is by far the most subtle and profound. By 
consensus it is much much harder to learn than Einstein’s theory of gravity, which in fact 
should properly be regarded as part of field theory, as will be made clear in this book. So 
don’t expect easy cruising, particularly if you don’t have someone to clarify things for you 
once in a while. Try an online physics forum. Do at least some of the exercises. Remember: 
“No one expects a guitarist to learn to play by going to concerts in Central Park or by 
spending hours reading transcriptions of Jimi Hendrix solos. Guitarists practice. Guitarists 
play the guitar until their fingertips are calloused. Similarly, physicists solve problems.” 2 
Of course, if you don’t have the prerequisites, you won’t be able to understand this or any 
other field theory text. But if you have mastered quantum mechanics, keep on trucking 
and you will get there. 


2 N. Newbury et al., Princeton Problems in Physics with Solutions, Princeton University Press, Princeton, 1991. 
The view will be worth it, I promise. My thesis advisor Sidney Coleman used to start his
field theory course saying, “Not only God knows, I know, and by the end of the semester, 
you will know.” By the end of this book, you too will know how God weaves the universe 
out of a web of interlocking fields. I would like to change Dirac’s statement “God is a 
mathematician” to “God is a quantum field theorist.” 

Some of you steady truckers might want to ask what to do when you get to the end. During
my junior year in college, after my encounter with Mandl, I asked Arthur Wightman
what to read next. He told me to read the textbook by S. S. Schweber, which at close to a 
thousand pages was referred to by students as “the monster” and which could be extremely 
opaque at places. After I slugged my way to the end, Wightman told me, “Read it again.” 
Fortunately for me, volume I of Bjorken and Drell had already come out. But there is wisdom
in reading a book again; things that you miss the first time may later leap out at you.
So my advice is “Read it again.” Of course, every physics student also knows that different 
explanations offered by different books may click with different folks. So read other field 
theory books. Quantum field theory is so profound that most people won’t get it in one 
pass. 

On the subject of other field theory texts: James Bjorken kindly wrote in my much-used 
copy of Bjorken and Drell that the book was obsolete. Hey BJ, it isn’t. Certainly, volume I 
will never be passé. On another occasion, Steve Weinberg told me, referring to his field
theory book, that “I wrote the book that I would have liked to learn from.” I could equally 
well say that “I wrote the book that I would have liked to learn from.” Without the least 
bit of hubris, I can say that I prefer my book to Schweber’s. The moral here is that if you 
don’t like this book you should write your own. 


I try not to do clunky 

I explained my philosophy in the preface to the first edition, but allow me a few more 
words here. I will teach you how to calculate, but I also have what I regard as a higher aim, 
to convey to you an enjoyment of quantum field theory in all its splendors (and by “all” I 
mean not merely quantum field theory as defined by some myopic physicists as applicable 
only to particle physics). I try to erect an elegant and logically tight framework and put a 
light touch on a heavy subject. 

In spite of the image conjured up by Zvi Bern of some future field theorist curled up 
in bed reading this book, I expect you to grab pen and paper and work. You could do 
it in bed if you want, but work you must. I intentionally did not fill in all the steps; it 
would hardly be a light touch if I do every bit of algebra for you. Nevertheless, I have done 
algebra when I think that it would help you. Actually, I love doing algebra, particularly 
when things work out so elegantly as in quantum field theory. But I don’t do clunky. I 
do not like clunky-looking equations. I avoid spelling everything out and so expect you 
to have a certain amount of “sense.” As a small example, near the end of chapter I.10 I
suppressed the spacetime dependence of the fields $\varphi_a$ and $\delta\varphi_a$. If you didn’t realize, after
some 70 pages, that fields are functions of where you are in spacetime, you are quite lost,
my friend. My plan is to “keep you on your toes” and I purposely want you to feel puzzled 
occasionally. I have faith that the sort of person who would be reading this book can always 
figure it out after a bit of thought. I realize that there are at least three distinct groups of 
readers, but let me say to the students, “How do you expect to do research if you have to 
be spoon-fed from line to line in a textbook?” 


Nuts who do not appreciate the Nutshell 

In the original preface, I quoted Ricky Nelson on the impossibility of pleasing everyone and 
so I was not at all surprised to find on Amazon.com a few people whom one of my friends 
calls “nuts who do not appreciate the Nutshell.” My friends advise me to leave these people
alone but I am sufficiently peeved to want to say a few words in my defense, no matter how 
nutty the charge. First, I suppose that those who say the book is too mathematical cancel 
out those who say the book is not mathematical enough. The people in the first group are 
not informed, while those in the second group are misinformed. 

Quantum field theory does not have to be mathematical. I know of at least three Fields
Medalists who enjoyed the book. A review for the American Mathematical Society offered
this deep statement in praise of the book: “It is often deeper to know why something is 
true rather than to have a proof that it is true.” (Indeed, a Fields Medalist once told me that 
top mathematicians secretly think like physicists and after they work out the broad outline 
of a proof they then dress it up with epsilons and deltas. I have no idea if this is true only 
for one, for many, or for all Fields Medalists. I suspect that it is true for many.) 

Then there is the person who denounces the book for its lack of rigor. Well, I happen to 
know, or at least used to know, a thing or two about mathematical rigor, since I wrote my 
senior thesis with Wightman on what I would call “fairly rigorous” quantum field theory. 
As we like to say in the theoretical physics community, too much rigor soon leads to rigor 
mortis. Be warned. Indeed, as Feynman would tell students, if this ain’t rigorous enough 
for you the math department is just one building over. So read a more rigorous book. It is 
a free country. 

More serious is the impression that several posters on Amazon.com have that the book is 
too elementary. I humbly beg to differ. The book gives the impression of being elementary 
but in fact covers more material than many other texts. If you master everything in the 
Nutshell, you would know more than most professors of field theory and could start doing 
research. I am not merely making an idle claim but could give an actual proof. All the 
ingredients that went into the spinor helicity formalism that led to a deep field theoretic 
discovery described in part N could be found in the first edition of this book. Of course, 
reading a textbook is not enough; you have to come up with the good ideas. 

As for he who says that the book does not look complicated enough and hence can’t be 
a serious treatment, I would ask him to compare a modern text on electromagnetism with 
Maxwell’s treatises. 





Thanks 

In the original preface and closing words, I mentioned that I learned a great deal of quantum 
field theory from Sidney Coleman. His clarity of thought and lucid exposition have 
always inspired me. Unhappily, he passed away in 2007. After this book was published, I 
visited Sidney on different occasions, but sadly, he was already in a mental fog. 

In preparing this second edition, I am grateful to Nima Arkani-Hamed, Yoni Ben-Tov, 
Nathan Berkovits, Marty Einhorn, Joshua Feinberg, Howard Georgi, Tim Hsieh, Brendan 
Keller, Joe Polchinski, Yong-shi Wu, and Jean-Bernard Zuber for their helpful comments. 
Some of them read parts or all of the added chapters. I thank especially Zvi Bern and 
Rafael Porto for going over the chapters in part N with great care and for many useful 
suggestions. I also thank Craig Kunimoto, Richard Neher, Matt Pillsbury, and Rafael 
Porto for teaching me the black art of composing equations on the computer. My editor 
at Princeton University Press, Ingrid Gnerlich, has always been a pleasure to talk to 
and work with. I also thank Kathleen Cioffi and Cyd Westmoreland for their meticulous 
work in producing this book. Last but not least, I am grateful to my wife Janice for her 
encouragement and loving support. 



Convention, Notation, and Units 


For the same reason that we no longer use a certain king’s feet to measure distance, we use 
natural units in which the speed of light $c$ and the Dirac symbol $\hbar$ are both set equal to 1. 
Planck made the profound observation that in natural units all physical quantities can be 
expressed in terms of the Planck mass $M_{\rm Planck} = 1/\sqrt{G_{\rm Newton}} \simeq 10^{19}$ GeV. The quantities 
$c$ and $\hbar$ are not so much fundamental constants as conversion factors. In this light, I am 
genuinely puzzled by condensed matter physicists carrying around Boltzmann’s constant 
$k$, which is no different from the conversion factor between feet and meters. 

Spacetime coordinates $x^\mu$ are labeled by Greek indices ($\mu = 0, 1, 2, 3$) with the time 
coordinate $x^0$ sometimes denoted by $t$. Space coordinates $x^i$ are labeled by Latin indices 
($i = 1, 2, 3$) and $\partial_\mu \equiv \partial/\partial x^\mu$. We use a Minkowski metric $\eta^{\mu\nu}$ with signature $(+, -, -, -)$ 
so that $\eta^{00} = +1$. We write $\eta^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi = \partial_\mu\varphi\,\partial^\mu\varphi = (\partial\varphi)^2 = (\partial\varphi/\partial t)^2 - \sum_i (\partial\varphi/\partial x^i)^2$. The 
metric in curved spacetime is always denoted by $g^{\mu\nu}$, but often I will also use $g^{\mu\nu}$ for the 
Minkowski metric when the context indicates clearly that we are in flat spacetime. 

Since I will be talking mostly about relativistic quantum field theory in this book I will 
without further clarification use a relativistic language. Thus, when I speak of momentum, 
unless otherwise specified, I mean energy and momentum. Also since $\hbar = 1$, I will not 
distinguish between wave vector $k$ and momentum, and between frequency $\omega$ and energy. 

In local field theory I deal primarily with the Lagrangian density $\mathcal{L}$ and not the Lagrangian 
$L = \int d^3x\, \mathcal{L}$. As is common practice in the literature and in oral discussion, I 
will often abuse terminology and simply refer to $\mathcal{L}$ as the Lagrangian. I will commit other 
minor abuses such as writing 1 instead of $I$ for the unit matrix. I use the same symbol 
$\varphi$ for the Fourier transform $\varphi(k)$ of a function $\varphi(x)$ whenever there is no risk of confusion, 
as is almost always the case. I prefer an abused terminology to cluttered notation and 
unbearable pedantry. 

The symbol * denotes complex conjugation, and † hermitean conjugation: The former 
applies to a number and the latter to an operator. I also use the notation c.c. and h.c. Often 
when there is no risk of confusion I abuse the notation, using † when I should use *. For 
instance, in a path integral, bosonic fields are just number-valued fields, but nevertheless 
I write $\varphi^\dagger$ rather than $\varphi^*$. For a matrix $M$, then of course $M^\dagger$ and $M^*$ should be carefully 
distinguished from each other. 

I made an effort to get factors of 2 and $\pi$ right, but some errors will be inevitable. 


Who Needs It? 


Who needs quantum field theory? 

Quantum field theory arose out of our need to describe the ephemeral nature of life. 

No, seriously, quantum field theory is needed when we confront simultaneously the two 
great physics innovations of the last century of the previous millennium: special relativity 
and quantum mechanics. Consider a fast moving rocket ship close to light speed. You need 
special relativity but not quantum mechanics to study its motion. On the other hand, to 
study a slow moving electron scattering on a proton, you must invoke quantum mechanics, 
but you don’t have to know a thing about special relativity. 

It is in the peculiar confluence of special relativity and quantum mechanics that a new 
set of phenomena arises: Particles can be born and particles can die. It is this matter of 
birth, life, and death that requires the development of a new subject in physics, that of 
quantum field theory. 

Let me give a heuristic discussion. In quantum mechanics the uncertainty principle tells 
us that the energy can fluctuate wildly over a small interval of time. According to special 
relativity, energy can be converted into mass and vice versa. With quantum mechanics and 
special relativity, the wildly fluctuating energy can metamorphose into mass, that is, into 
new particles not previously present. 

Write down the Schrödinger equation for an electron scattering off a proton. The 
equation describes the wave function of one electron, and no matter how you shake 
and bake the mathematics of the partial differential equation, the electron you follow 
will remain one electron. But special relativity tells us that energy can be converted to 
matter: If the electron is energetic enough, an electron and a positron (“the antielectron”) 
can be produced. The Schrödinger equation is simply incapable of describing such a 
phenomenon. Nonrelativistic quantum mechanics must break down. 

You saw the need for quantum field theory at another point in your education. Toward 
the end of a good course on nonrelativistic quantum mechanics the interaction between 
radiation and atoms is often discussed. You would recall that the electromagnetic field is 
treated as a field; well, it is a field. Its Fourier components are quantized as a collection 
of harmonic oscillators, leading to creation and annihilation operators for photons. So 
there, the electromagnetic field is a quantum field. Meanwhile, the electron is treated as a 
poor cousin, with a wave function $\Psi(x)$ governed by the good old Schrödinger equation. 
Photons can be created or annihilated, but not electrons. Quite aside from the experimental 
fact that electrons and positrons could be created in pairs, it would be intellectually more 
satisfying to treat electrons and photons, as they are both elementary particles, on the same 
footing. 

Figure 1.1.1 

So, I was more or less right: Quantum field theory is a response to the ephemeral nature 
of life. 

All of this is rather vague, and one of the purposes of this book is to make these remarks 
more precise. For the moment, to make these thoughts somewhat more concrete, let us 
ask where in classical physics we might have encountered something vaguely resembling 
the birth and death of particles. Think of a mattress, which we idealize as a 2-dimensional 
lattice of point masses connected to each other by springs (fig. 1.1.1). For simplicity, let 
us focus on the vertical displacement [which we denote by $q_a(t)$] of the point masses and 
neglect the small horizontal movement. The index a simply tells us which mass we are 
talking about. The Lagrangian is then 

$$L = \frac{1}{2}\Big(\sum_a m\dot q_a^2 - \sum_{a,b} k_{ab}\, q_a q_b - \sum_{a,b,c} g_{abc}\, q_a q_b q_c - \cdots\Big) \qquad (1)$$

Keeping only the terms quadratic in $q$ (the “harmonic approximation”) we have the equations 
of motion $m\ddot q_a = -\sum_b k_{ab} q_b$. Taking the $q$’s as oscillating with frequency $\omega$, we 
have $\sum_b k_{ab} q_b = m\omega^2 q_a$. The eigenfrequencies and eigenmodes are determined, respectively, 
by the eigenvalues and eigenvectors of the matrix $k$. As usual, we can form wave 
packets by superposing eigenmodes. When we quantize the theory, these wave packets behave 
like particles, in the same way that electromagnetic wave packets when quantized 
behave like particles called photons. 
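
To see the harmonic approximation at work, here is a minimal numerical sketch in Python. It is not the 2-dimensional mattress itself but a simplified stand-in chosen for illustration: a periodic 1-dimensional chain of equal masses joined by nearest-neighbor springs. The names `N_SITES`, `KAPPA`, and `MASS` and their values are assumptions, not taken from the text. For such a chain the potential is $\frac{1}{2}\sum_{a,b} k_{ab} q_a q_b$ with $k_{ab} = \kappa(2\delta_{ab} - \delta_{a,b+1} - \delta_{a,b-1})$, plane waves are eigenmodes, and $m\omega^2 = 4\kappa\sin^2(\pi n/N)$. The code checks $\sum_b k_{ab} q_b = m\omega^2 q_a$ mode by mode.

```python
import math

def spring_matrix(n_sites, kappa):
    """Matrix k_ab of a periodic chain: potential = (1/2) sum_ab k_ab q_a q_b."""
    k = [[0.0] * n_sites for _ in range(n_sites)]
    for a in range(n_sites):
        k[a][a] += 2.0 * kappa
        k[a][(a + 1) % n_sites] -= kappa
        k[a][(a - 1) % n_sites] -= kappa
    return k

def apply(matrix, vector):
    """Matrix-vector product sum_b k_ab q_b."""
    return [sum(row[b] * vector[b] for b in range(len(vector))) for row in matrix]

def mode(n_sites, n):
    """Trial eigenmode: a standing plane wave with n wavelengths around the ring."""
    return [math.cos(2.0 * math.pi * n * a / n_sites) for a in range(n_sites)]

def omega(n_sites, n, kappa, mass):
    """Frequency predicted for the periodic chain: 2 sqrt(kappa/m) |sin(pi n / N)|."""
    return 2.0 * math.sqrt(kappa / mass) * abs(math.sin(math.pi * n / n_sites))

# Illustrative values only (assumptions, not from the text).
N_SITES, KAPPA, MASS = 12, 3.0, 2.0

k = spring_matrix(N_SITES, KAPPA)
residuals = []
for n in range(N_SITES // 2 + 1):
    q = mode(N_SITES, n)
    kq = apply(k, q)
    w = omega(N_SITES, n, KAPPA, MASS)
    # Largest violation of k q = m omega^2 q over all sites.
    residuals.append(max(abs(kq[a] - MASS * w * w * q[a]) for a in range(N_SITES)))
    print(n, round(w, 4), residuals[-1])
```

The residuals should sit at the level of rounding error. Superposing a few neighboring modes of this kind is what builds the wave packets mentioned above.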

Since the theory is linear, two wave packets pass right through each other. But once we 
include the nonlinear terms, namely the terms cubic, quartic, and so forth in the q’s in 
(1), the theory becomes anharmonic. Eigenmodes now couple to each other. A wave packet 
might decay into two wave packets. When two wave packets come near each other, they 
scatter and perhaps produce more wave packets. This naturally suggests that the physics 
of particles can be described in these terms. 

Quantum field theory grew out of essentially these sorts of physical ideas. 

It struck me as limiting that even after some 75 years, the whole subject of quantum 
field theory remains rooted in this harmonic paradigm, to use a dreadfully pretentious 
word. We have not been able to get away from the basic notions of oscillations and wave 
packets. Indeed, string theory, the heir to quantum field theory, is still firmly founded on 
this harmonic paradigm. Surely, a brilliant young physicist, perhaps a reader of this book, 
will take us beyond. 


Condensed matter physics 

In this book I will focus mainly on relativistic field theory, but let me mention here that 
one of the great advances in theoretical physics in the last 30 years or so is the increasingly 
sophisticated use of quantum field theory in condensed matter physics. At first sight this 
seems rather surprising. After all, a piece of “condensed matter” consists of an enormous 
swarm of electrons moving nonrelativistically, knocking about among various atomic ions 
and interacting via the electromagnetic force. Why can’t we simply write down a gigantic 
wave function $\Psi(x_1, x_2, \cdots, x_N)$, where $x_j$ denotes the position of the $j$th electron and $N$ 
is a large but finite number? Okay, $\Psi$ is a function of many variables but it is still governed 
by a nonrelativistic Schrödinger equation. 

The answer is yes, we can, and indeed that was how solid state physics was first studied 
in its heroic early days (and still is in many of its subbranches). 

Why then does a condensed matter theorist need quantum field theory? Again, let us 
first go for a heuristic discussion, giving an overall impression rather than all the details. In 
a typical solid, the ions vibrate around their equilibrium lattice positions. This vibrational 
dynamics is best described by so-called phonons, which correspond more or less to the 
wave packets in the mattress model described above. 

This much you can read about in any standard text on solid state physics. Furthermore, 
if you have had a course on solid state physics, you would recall that the energy levels 
available to electrons form bands. When an electron is kicked (by a phonon field say) from 
a filled band to an empty band, a hole is left behind in the previously filled band. This 
hole can move about with its own identity as a particle, enjoying a perfectly comfortable 
existence until another electron comes into the band and annihilates it. Indeed, it was 
with a picture of this kind that Dirac first conceived of a hole in the “electron sea” as the 
antiparticle of the electron, the positron. 

We will flesh out this heuristic discussion in subsequent chapters in parts V and VI. 

Marriages 


To summarize, quantum field theory was born of the necessity of dealing with the marriage 
of special relativity and quantum mechanics, just as the new science of string theory is 
being born of the necessity of dealing with the marriage of general relativity and quantum 
mechanics. 




Path Integral Formulation of Quantum Physics 


The professor’s nightmare: a wise guy in the class 

As I noted in the preface, I know perfectly well that you are eager to dive into quantum field 
theory, but first we have to review the path integral formalism of quantum mechanics. This 
formalism is not universally taught in introductory courses on quantum mechanics, but 
even if you have been exposed to it, this chapter will serve as a useful review. The reason I 
start with the path integral formalism is that it offers a particularly convenient way of going 
from quantum mechanics to quantum field theory. I will first give a heuristic discussion, 
to be followed by a more formal mathematical treatment. 

Perhaps the best way to introduce the path integral formalism is by telling a story, 
certainly apocryphal as many physics stories are. Long ago, in a quantum mechanics class, 
the professor droned on and on about the double-slit experiment, giving the standard 
treatment. A particle emitted from a source S (fig. 1.2.1) at time $t = 0$ passes through one 
or the other of two holes, $A_1$ and $A_2$, drilled in a screen and is detected at time $t = T$ by 
a detector located at O. The amplitude for detection is given by a fundamental postulate 
of quantum mechanics, the superposition principle, as the sum of the amplitude for the 
particle to propagate from the source S through the hole $A_1$ and then onward to the point 
O and the amplitude for the particle to propagate from the source S through the hole $A_2$ 
and then onward to the point O. 

Suddenly, a very bright student, let us call him Feynman, asked, “Professor, what if 
we drill a third hole in the screen?” The professor replied, “Clearly, the amplitude for 
the particle to be detected at the point O is now given by the sum of three amplitudes, 
the amplitude for the particle to propagate from the source S through the hole $A_1$ and 
then onward to the point O, the amplitude for the particle to propagate from the source S 
through the hole $A_2$ and then onward to the point O, and the amplitude for the particle to 
propagate from the source S through the hole $A_3$ and then onward to the point O.” 

The professor was just about ready to continue when Feynman interjected again, “What 
if I drill a fourth and a fifth hole in the screen?” Now the professor was visibly losing his 
patience: “All right, wise guy, I think it is obvious to the whole class that we just sum over 
all the holes.” 

Figure 1.2.1 

To make what the professor said precise, denote the amplitude for the particle to 
propagate from the source S through the hole $A_i$ and then onward to the point O as 
$\mathcal{A}(S \to A_i \to O)$. Then the amplitude for the particle to be detected at the point O is 

$$\mathcal{A}(\text{detected at } O) = \sum_i \mathcal{A}(S \to A_i \to O) \qquad (1)$$
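
The sum over holes in (1) can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: each straight leg of length $r$ is taken to contribute a pure phase $e^{ikr}$, the amplitudes of the two legs $S \to A_i$ and $A_i \to O$ are multiplied, and the geometry (`source`, `holes`, `on_axis`) and `WAVE_NUMBER` are made-up values. None of these specifics come from the text.

```python
import cmath
import math

def leg_amplitude(start, end, wave_number):
    """Assumed amplitude for one straight leg: a pure phase exp(i k r)."""
    distance = math.hypot(end[0] - start[0], end[1] - start[1])
    return cmath.exp(1j * wave_number * distance)

def detection_amplitude(source, holes, detector, wave_number):
    """Equation (1): sum over holes A_i of the amplitude for S -> A_i -> O."""
    return sum(
        leg_amplitude(source, hole, wave_number)
        * leg_amplitude(hole, detector, wave_number)
        for hole in holes
    )

# Hypothetical setup: screen at x = 10, detector on the axis at x = 20.
WAVE_NUMBER = 2.0 * math.pi  # wavelength 1 in arbitrary units
source = (0.0, 0.0)
holes = [(10.0, 1.0), (10.0, -1.0)]
on_axis = (20.0, 0.0)

one_hole = abs(detection_amplitude(source, holes[:1], on_axis, WAVE_NUMBER)) ** 2
two_holes = abs(detection_amplitude(source, holes, on_axis, WAVE_NUMBER)) ** 2
print(one_hole, two_holes)
```

On the axis the two paths have equal length, so the amplitudes add in phase and the probability with two holes is four times that with one, not twice. Appending more entries to `holes` is the professor's "just sum over all the holes."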

But Feynman persisted, “What if we now add another screen (fig. 1.2.2) with some holes 
drilled in it?” The professor was really losing his patience: “Look, can’t you see that you 
just take the amplitude to go from the source S to the hole $A_i$ in the first screen, then to 
the hole $B_j$ in the second screen, then to the detector at O, and then sum over all $i$ and $j$?” 

Feynman continued to pester, “What if I put in a third screen, a fourth screen, eh? What 
if I put in a screen and drill an infinite number of holes in it so that the screen is no longer 
there?” The professor sighed, “Let’s move on; there is a lot of material to cover in this 
course.” 



Figure 1.2.2 

Figure 1.2.3 


But dear reader, surely you see what that wise guy Feynman was driving at. I especially 
enjoy his observation that if you put in a screen and drill an infinite number of holes in it, 
then that screen is not really there. Very Zen! What Feynman showed is that even if there 
were just empty space between the source and the detector, the amplitude for the particle 
to propagate from the source to the detector is the sum of the amplitudes for the particle to 
go through each one of the holes in each one of the (nonexistent) screens. In other words, 
we have to sum over the amplitude for the particle to propagate from the source to the 
detector following all possible paths between the source and the detector (fig. 1.2.3). 

$$\mathcal{A}(\text{particle to go from S to O in time } T) = \sum_{(\text{paths})} \mathcal{A}(\text{particle to go from S to O in time } T \text{ following a particular path}) \qquad (2)$$

Now the mathematically rigorous will surely get anxious over how $\sum_{(\text{paths})}$ is to be 
defined. Feynman followed Newton and Leibniz: Take a path (fig. 1.2.4), approximate it 
by straight line segments, and let the segments go to zero. You can see that this is just like 
filling up a space with screens spaced infinitesimally close to each other, with an infinite 
number of holes drilled in each screen. 

Fine, but how to construct the amplitude $\mathcal{A}$(particle to go from S to O in time T following 
a particular path)? Well, we can use the unitarity of quantum mechanics: If we know the 
amplitude for each infinitesimal segment, then we just multiply them together to get the 
amplitude of the whole path. 



Figure 1.2.4 

In quantum mechanics, the amplitude to propagate from a point $q_I$ to a point $q_F$ in 
time $T$ is governed by the unitary operator $e^{-iHT}$, where $H$ is the Hamiltonian. More 
precisely, denoting by $|q\rangle$ the state in which the particle is at $q$, the amplitude in question 
is just $\langle q_F| e^{-iHT} |q_I\rangle$. Here we are using the Dirac bra and ket notation. Of course, 
philosophically, you can argue that to say the amplitude is $\langle q_F| e^{-iHT} |q_I\rangle$ amounts to a 
postulate and a definition of $H$. It is then up to experimentalists to discover that $H$ is 
hermitean, has the form of the classical Hamiltonian, et cetera. 

Indeed, the whole path integral formalism could be written down mathematically starting 
with the quantity $\langle q_F| e^{-iHT} |q_I\rangle$, without any of Feynman’s jive about screens with an 
infinite number of holes. Many physicists would prefer a mathematical treatment without 
the talk. As a matter of fact, the path integral formalism was invented by Dirac precisely 
in this way, long before Feynman.¹ 

A necessary word about notation even though it interrupts the narrative flow: We denote 
the coordinates transverse to the axis connecting the source to the detector by $q$, rather 
than x, for a reason which will emerge in a later chapter. For notational simplicity, we will 
think of q as 1-dimensional and suppress the coordinate along the axis connecting the 
source to the detector. 


Dirac’s formulation 

Let us divide the time $T$ into $N$ segments each lasting $\delta t = T/N$. Then we write 

$$\langle q_F| e^{-iHT} |q_I\rangle = \langle q_F| e^{-iH\delta t} e^{-iH\delta t} \cdots e^{-iH\delta t} |q_I\rangle$$

Our states are normalized by $\langle q'|q\rangle = \delta(q' - q)$ with $\delta$ the Dirac delta function. (Recall 
that $\delta$ is defined by $\delta(q) = \int_{-\infty}^{\infty} (dp/2\pi)\, e^{ipq}$ and $\int dq\, \delta(q) = 1$. See appendix 1.) Now use 
the fact that $|q\rangle$ forms a complete set of states so that $\int dq\, |q\rangle\langle q| = 1$. To see that the 
normalization is correct, multiply on the left by $\langle q''|$ and on the right by $|q'\rangle$, thus obtaining 
$\int dq\, \delta(q'' - q)\delta(q - q') = \delta(q'' - q')$. Insert 1 between all these factors of $e^{-iH\delta t}$ and write 

$$\langle q_F| e^{-iHT} |q_I\rangle = \Big(\prod_{j=1}^{N-1} \int dq_j\Big) \langle q_F| e^{-iH\delta t} |q_{N-1}\rangle \langle q_{N-1}| e^{-iH\delta t} |q_{N-2}\rangle \cdots \langle q_2| e^{-iH\delta t} |q_1\rangle \langle q_1| e^{-iH\delta t} |q_I\rangle \qquad (3)$$

Focus on an individual factor $\langle q_{j+1}| e^{-iH\delta t} |q_j\rangle$. Let us take the baby step of first evaluating 
it just for the free-particle case in which $H = \hat p^2/2m$. The hat on $\hat p$ reminds us 
that it is an operator. Denote by $|p\rangle$ the eigenstate of $\hat p$, namely $\hat p\,|p\rangle = p\,|p\rangle$. Do you remember 
from your course in quantum mechanics that $\langle q|p\rangle = e^{ipq}$? Sure you do. This 
just says that the momentum eigenstate is a plane wave in the coordinate representation. 
(The normalization is such that $\int (dp/2\pi)\, |p\rangle\langle p| = 1$. Again, to see that the normalization 
is correct, multiply on the left by $\langle q'|$ and on the right by $|q\rangle$, thus obtaining 
$\int (dp/2\pi)\, e^{ip(q'-q)} = \delta(q' - q)$.) So again inserting a complete set of states, we write 

$$\langle q_{j+1}| e^{-i\delta t(\hat p^2/2m)} |q_j\rangle = \int \frac{dp}{2\pi}\, \langle q_{j+1}| e^{-i\delta t(\hat p^2/2m)} |p\rangle \langle p|q_j\rangle$$
$$= \int \frac{dp}{2\pi}\, e^{-i\delta t(p^2/2m)} \langle q_{j+1}|p\rangle \langle p|q_j\rangle$$
$$= \int \frac{dp}{2\pi}\, e^{-i\delta t(p^2/2m)} e^{ip(q_{j+1}-q_j)}$$

1 For the true history of the path integral, see p. xv of my introduction to R. P. Feynman, QED: The Strange 
Theory of Light and Matter. 


Note that we removed the hat from the momentum operator in the exponential: Since the 
momentum operator is acting on an eigenstate, it can be replaced by its eigenvalue. Also, 
we are evidently working in the Heisenberg picture. 

The integral over p is known as a Gaussian integral, with which you may already be 
familiar. If not, turn to appendix 2 to this chapter. 

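Doing the integral over $p$, we get (using (21)) 

$$\langle q_{j+1}| e^{-i\delta t(\hat p^2/2m)} |q_j\rangle = \Big(\frac{-im}{2\pi\delta t}\Big)^{1/2} e^{[im(q_{j+1}-q_j)^2]/2\delta t} = \Big(\frac{-im}{2\pi\delta t}\Big)^{1/2} e^{i\delta t(m/2)[(q_{j+1}-q_j)/\delta t]^2}$$

Putting this into (3) yields 

$$\langle q_F| e^{-iHT} |q_I\rangle = \Big(\frac{-im}{2\pi\delta t}\Big)^{N/2} \Big(\prod_{k=1}^{N-1} \int dq_k\Big)\, e^{i\delta t(m/2)\sum_{j=0}^{N-1}[(q_{j+1}-q_j)/\delta t]^2}$$

with $q_0 = q_I$ and $q_N = q_F$. 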

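We can now go to the continuum limit $\delta t \to 0$. Newton and Leibniz taught us to replace 
$[(q_{j+1} - q_j)/\delta t]^2$ by $\dot q^2$, and $\delta t \sum_{j=0}^{N-1}$ by $\int_0^T dt$. Finally, we define the integral over paths 
as 

$$\int Dq(t) = \lim_{N\to\infty} \Big(\frac{-im}{2\pi\delta t}\Big)^{N/2} \Big(\prod_{k=1}^{N-1} \int dq_k\Big)$$

We thus obtain the path integral representation 

$$\langle q_F| e^{-iHT} |q_I\rangle = \int Dq(t)\, e^{i\int_0^T dt\, \frac{1}{2} m\dot q^2} \qquad (4)$$

This fundamental result tells us that to obtain $\langle q_F| e^{-iHT} |q_I\rangle$ we simply integrate over 
all possible paths $q(t)$ such that $q(0) = q_I$ and $q(T) = q_F$. 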

As an exercise you should convince yourself that had we started with the Hamiltonian 
for a particle in a potential $H = \hat p^2/2m + V(\hat q)$ (again the hat on $\hat q$ indicates an operator) 
the final result would have been 

$$\langle q_F| e^{-iHT} |q_I\rangle = \int Dq(t)\, e^{i\int_0^T dt\,[\frac{1}{2} m\dot q^2 - V(q)]} \qquad (5)$$

We recognize the quantity $\frac{1}{2} m\dot q^2 - V(q)$ as just the Lagrangian $L(\dot q, q)$. The Lagrangian 
has emerged naturally from the Hamiltonian! In general, we have 

$$\langle q_F| e^{-iHT} |q_I\rangle = \int Dq(t)\, e^{i\int_0^T dt\, L(\dot q, q)} \qquad (6)$$

To avoid potential confusion, let me be clear that $t$ appears as an integration variable in 
the exponential on the right-hand side. The appearance of $t$ in the path integral measure 
$Dq(t)$ is simply to remind us that $q$ is a function of $t$ (as if we need reminding). Indeed, 
this measure will often be abbreviated to $Dq$. You might recall that $\int_0^T dt\, L(\dot q, q)$ is called 
the action $S(q)$ in classical mechanics. The action $S$ is a functional of the function $q(t)$. 

Often, instead of specifying that the particle starts at an initial position $q_I$ and ends at 
a final position $q_F$, we prefer to specify that the particle starts in some initial state $I$ and 
ends in some final state $F$. Then we are interested in calculating $\langle F| e^{-iHT} |I\rangle$, which upon 
inserting complete sets of states can be written as 

$$\int dq_F \int dq_I\, \langle F|q_F\rangle \langle q_F| e^{-iHT} |q_I\rangle \langle q_I|I\rangle,$$

which mixing Schrödinger and Dirac notation we can write as 

$$\int dq_F \int dq_I\, \Psi_F(q_F)^* \langle q_F| e^{-iHT} |q_I\rangle \Psi_I(q_I).$$

In most cases we are interested in taking $|I\rangle$ and $|F\rangle$ as the ground state, which we will 
denote by $|0\rangle$. It is conventional to give the amplitude $\langle 0| e^{-iHT} |0\rangle$ the name $Z$. 

At the level of mathematical rigor we are working with, we count on the path integral 
$\int Dq(t)\, e^{i\int_0^T dt\,[\frac{1}{2} m\dot q^2 - V(q)]}$ to converge because the oscillatory phase factors from different 
paths tend to cancel out. It is somewhat more rigorous to perform a so-called Wick rotation 
to Euclidean time. This amounts to substituting $t \to -it$ and rotating the integration 
contour in the complex $t$ plane so that the integral becomes 

$$Z = \int Dq(t)\, e^{-\int_0^T dt\,[\frac{1}{2} m\dot q^2 + V(q)]} \qquad (7)$$

known as the Euclidean path integral. As is done in appendix 2 to this chapter with ordinary 
integrals we will always assume that we can make this type of substitution with impunity. 
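
One practical payoff of the Euclidean form is that it can be put on a computer, since the integrand is a damped exponential rather than an oscillating phase. The Python sketch below is an illustration under assumptions of my choosing, not a method taken from the text: a harmonic potential $V(q) = \frac{1}{2}m\omega^2 q^2$ with $m = \omega = 1$, a finite grid in $q$, and a symmetric treatment of $V$ in each time slice. It builds the one-slice Euclidean kernel $\big(\frac{m}{2\pi\delta t}\big)^{1/2} e^{-\delta t[\frac{1}{2}m((q'-q)/\delta t)^2 + V]}$ and applies it repeatedly. It also uses the standard fact that for large Euclidean time the evolution is dominated by the ground state and decays like $e^{-E_0 T}$, so the growth factor per slice estimates $E_0$.

```python
import math

# Illustrative parameters (assumptions, not from the text).
MASS, OMEGA = 1.0, 1.0
DT = 0.1                  # Euclidean time step, delta t
Q_MAX, N_GRID = 6.0, 81   # position grid from -Q_MAX to +Q_MAX
DQ = 2.0 * Q_MAX / (N_GRID - 1)
grid = [-Q_MAX + DQ * a for a in range(N_GRID)]

def potential(q):
    """Harmonic potential V(q) = (1/2) m omega^2 q^2."""
    return 0.5 * MASS * OMEGA ** 2 * q ** 2

def short_time_kernel(q_new, q_old):
    """One Euclidean time slice <q_new| e^{-H dt} |q_old>, times the measure dq."""
    kinetic = 0.5 * MASS * ((q_new - q_old) / DT) ** 2
    average_v = 0.5 * (potential(q_new) + potential(q_old))
    prefactor = math.sqrt(MASS / (2.0 * math.pi * DT))
    return prefactor * math.exp(-DT * (kinetic + average_v)) * DQ

kernel = [[short_time_kernel(qn, qo) for qo in grid] for qn in grid]

def step(vector):
    """Integrate over the intermediate position: one more time slice."""
    return [sum(k * v for k, v in zip(row, vector)) for row in kernel]

# Repeated slicing projects any even starting function onto the ground state.
psi = [1.0] * N_GRID
growth = 1.0
for _ in range(300):
    new_psi = step(psi)
    old_norm = math.sqrt(sum(v * v for v in psi))
    new_norm = math.sqrt(sum(v * v for v in new_psi))
    growth = new_norm / old_norm          # tends to exp(-E_0 * DT)
    psi = [v / new_norm for v in new_psi]

ground_energy = -math.log(growth) / DT
print(ground_energy)
```

For these parameters the estimate should come out close to the textbook value $\frac{1}{2}\omega$, with a small offset that shrinks as `DT` is reduced.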


The classical world emerges 

One particularly nice feature of the path integral formalism is that the classical limit of 
quantum mechanics can be recovered easily. We simply restore Planck’s constant $\hbar$ in (6): 

$$\langle q_F| e^{-(i/\hbar)HT} |q_I\rangle = \int Dq(t)\, e^{(i/\hbar)\int_0^T dt\, L(\dot q, q)}$$

and take the $\hbar \to 0$ limit. Applying the stationary phase or steepest descent method (if you 
don’t know it see appendix 3 to this chapter) we obtain $e^{(i/\hbar)\int_0^T dt\, L(\dot q_c, q_c)}$, where $q_c(t)$ is 
the “classical path” determined by solving the Euler-Lagrange equation $(d/dt)(\delta L/\delta\dot q) - (\delta L/\delta q) = 0$ 
with appropriate boundary conditions. 
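
The statement that the classical path makes the action stationary is easy to check on a discretized path. The Python sketch below is a toy illustration under assumptions of my choosing: a free particle with $m = 1$ going from $q = 0$ to $q = 1$ in time $T = 1$, and a family of deformed paths (`wiggled`) that add a sine bump vanishing at both endpoints. It evaluates the time-sliced action $\sum_j \delta t\,\frac{m}{2}[(q_{j+1} - q_j)/\delta t]^2$ on the straight line and on the deformed paths.

```python
import math

# Illustrative setup (assumptions, not from the text).
MASS = 1.0
TOTAL_TIME = 1.0
N_STEPS = 200
DT = TOTAL_TIME / N_STEPS
Q_INITIAL, Q_FINAL = 0.0, 1.0

def action(path):
    """Discretized free-particle action: sum over slices of dt (m/2) (dq/dt)^2."""
    return sum(
        DT * 0.5 * MASS * ((path[j + 1] - path[j]) / DT) ** 2
        for j in range(N_STEPS)
    )

def straight_line():
    """Classical path of a free particle: constant velocity between the endpoints."""
    return [
        Q_INITIAL + (Q_FINAL - Q_INITIAL) * j / N_STEPS
        for j in range(N_STEPS + 1)
    ]

def wiggled(amplitude, n_half_waves=1):
    """Classical path plus a sine bump that vanishes at both endpoints."""
    return [
        q + amplitude * math.sin(n_half_waves * math.pi * j / N_STEPS)
        for j, q in enumerate(straight_line())
    ]

classical_action = action(straight_line())
print("classical action:", classical_action)
for amplitude in (0.01, 0.1, 0.5):
    excess = action(wiggled(amplitude)) - classical_action
    print(amplitude, excess)
```

The excess action grows like the square of the deformation amplitude, with no linear term. That is the stationarity condition: paths near $q_c(t)$ contribute with nearly the same phase $e^{(i/\hbar)S}$ and add up, while for small $\hbar$ the phases of distant paths vary rapidly and cancel.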


Appendix 1 


For your convenience, I include a concise review of the Dirac delta function here. Let us define a function $d_K(x)$ by 

$$d_K(x) \equiv \int_{-K/2}^{K/2} \frac{dk}{2\pi}\, e^{ikx} = \frac{1}{\pi x} \sin\frac{Kx}{2} \qquad (8)$$

for arbitrary real values of $x$. We see that for large $K$ the even function $d_K(x)$ is sharply peaked at the origin $x = 0$, 
reaching a value of $K/2\pi$ at the origin, crossing zero at $x = 2\pi/K$, and then oscillating with ever decreasing 
amplitude. Furthermore, 

$$\int_{-\infty}^{\infty} dx\, d_K(x) = \frac{2}{\pi} \int_0^{\infty} \frac{dx}{x} \sin\frac{Kx}{2} = \frac{2}{\pi} \int_0^{\infty} \frac{dy}{y} \sin y = 1 \qquad (9)$$


The Dirac delta function is defined by $\delta(x) = \lim_{K\to\infty} d_K(x)$. Heuristically, it could be thought of as an 
infinitely sharp spike located at $x = 0$ such that the area under the spike is equal to 1. Thus for a function $s(x)$ 
well-behaved around $x = a$ we have 

$$\int_{-\infty}^{\infty} dx\, \delta(x - a)\, s(x) = s(a) \qquad (10)$$


(By the way, for what it is worth, mathematicians call the delta function a “distribution,” not a function.) 

Our derivation also yields an integral representation for the delta function that we will use repeatedly in this 
text: 

$$\delta(x) = \int_{-\infty}^{\infty} \frac{dk}{2\pi}\, e^{ikx} \qquad (11)$$


We will often use the identity 

$$\int_{-\infty}^{\infty} dx\, \delta(f(x))\, s(x) = \sum_i \frac{s(x_i)}{|f'(x_i)|} \qquad (12)$$

where $x_i$ denotes the zeroes of $f(x)$ (in other words, $f(x_i) = 0$ and $f'(x_i) \equiv df(x_i)/dx$). To prove this, first show 
that $\int_{-\infty}^{\infty} dx\, \delta(bx)\, s(x) = \int_{-\infty}^{\infty} dx\, \frac{\delta(x)}{|b|}\, s(x) = s(0)/|b|$. The factor of $1/b$ follows from dimensional analysis. (To 
see the need for the absolute value, simply note that $\delta(bx)$ is a positive function. Alternatively, change integration 
variable to $y = bx$: for $b$ negative we have to flip the integration limits.) To obtain (12), expand around each of 
the zeroes of $f(x)$. 

Another useful identity (understood in the limit in which the positive infinitesimal $\varepsilon$ tends to zero) is 

$$\frac{1}{x + i\varepsilon} = \mathcal{P}\frac{1}{x} - i\pi\delta(x) \qquad (13)$$

To see this, simply write $1/(x + i\varepsilon) = x/(x^2 + \varepsilon^2) - i\varepsilon/(x^2 + \varepsilon^2)$, and then note that $\varepsilon/(x^2 + \varepsilon^2)$ as a function 
of $x$ is sharply spiked around $x = 0$ and that its integral from $-\infty$ to $\infty$ is equal to $\pi$. Thus we have another 
representation of the Dirac delta function: 

$$\delta(x) = \frac{1}{\pi} \frac{\varepsilon}{x^2 + \varepsilon^2} \qquad (14)$$
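
The representation (14) lends itself to a quick numerical sanity check of the defining property (10). The following Python sketch is offered only as an illustration: the test function $s(x) = e^{-x^2}$, the point `A`, the values of `epsilon`, and the plain Riemann sum are all choices of mine, not from the text. It integrates $\frac{1}{\pi}\frac{\varepsilon}{(x-a)^2 + \varepsilon^2}\, s(x)$ and compares the result with $s(a)$ as $\varepsilon$ shrinks.

```python
import math

def lorentzian_delta(x, epsilon):
    """Representation (14): (1/pi) * epsilon / (x^2 + epsilon^2)."""
    return epsilon / (math.pi * (x * x + epsilon * epsilon))

def smeared_value(test_function, a, epsilon, half_width=10.0, n_points=200001):
    """Riemann-sum estimate of the integral of delta_eps(x - a) s(x) dx."""
    dx = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for j in range(n_points):
        x = a - half_width + j * dx
        total += lorentzian_delta(x - a, epsilon) * test_function(x)
    return total * dx

def s(x):
    """A smooth, rapidly decaying test function (an arbitrary choice)."""
    return math.exp(-x * x)

A = 0.5
results = {}
for epsilon in (0.3, 0.1, 0.01):
    results[epsilon] = smeared_value(s, A, epsilon)
    print(epsilon, results[epsilon], s(A))
```

The smeared value approaches $s(a)$ roughly linearly in $\varepsilon$. The grid spacing has to stay well below $\varepsilon$, which is why such a fine grid is used.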


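Meanwhile, the principal value integral is defined by 

$$\int dx\, \mathcal{P}\frac{1}{x}\, f(x) = \lim_{\varepsilon\to 0} \int dx\, \frac{x}{x^2 + \varepsilon^2}\, f(x) \qquad (15)$$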


Appendix 2 


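I will now show you how to do the integral $G \equiv \int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}x^2}$. The trick is to square the integral, call the dummy 
integration variable in one of the integrals $y$, and then pass to polar coordinates: 

$$G^2 = \int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}x^2} \int_{-\infty}^{+\infty} dy\, e^{-\frac{1}{2}y^2} = 2\pi \int_0^{+\infty} dr\, r\, e^{-\frac{1}{2}r^2} = 2\pi \int_0^{+\infty} dw\, e^{-w} = 2\pi$$

Thus, we obtain 

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}x^2} = \sqrt{2\pi} \qquad (16)$$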


Believe it or not, a significant fraction of the theoretical physics literature consists of varying and elaborating 
this basic Gaussian integral. The simplest extension is almost immediate: 

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2} = \Big(\frac{2\pi}{a}\Big)^{1/2} \qquad (17)$$

as can be seen by scaling $x \to x/\sqrt{a}$. 

Acting on this repeatedly with $-2(d/da)$ we obtain 

$$\langle x^{2n} \rangle \equiv \frac{\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2}\, x^{2n}}{\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2}} = \frac{1}{a^n}(2n-1)(2n-3)\cdots 5\cdot 3\cdot 1 \qquad (18)$$

The factor $1/a^n$ follows from dimensional analysis. To remember the factor $(2n-1)!! \equiv (2n-1)(2n-3)\cdots 5\cdot 
3\cdot 1$ imagine $2n$ points and connect them in pairs. The first point can be connected to one of $(2n-1)$ points, the 
second point can now be connected to one of the remaining $(2n-3)$ points, and so on. This clever observation, 
due to Gian Carlo Wick, is known as Wick’s theorem in the field theory literature. Incidentally, field theorists use 
the following graphical mnemonic in calculating, for example, $\langle x^6 \rangle$: Write $\langle x^6 \rangle$ as $\langle xxxxxx \rangle$ and connect the $x$’s, 
for example 

$$\langle xxxxxx \rangle$$

(with lines drawn joining the $x$’s in pairs). 

The pattern of connection is known as a Wick contraction. In this simple example, since the six $x$’s are identical, 
any one of the distinct Wick contractions gives the same value $a^{-3}$ and the final result for $\langle x^6 \rangle$ is just $a^{-3}$ times 
the number of distinct Wick contractions, namely $5\cdot 3\cdot 1 = 15$. We will soon come to a less trivial example, with 
distinct $x$’s, in which case distinct Wick contractions give distinct values. 
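
Formula (18) can be checked by brute force. The short Python sketch below is only an illustration: the value `A = 2.0`, the integration window, and the plain Riemann sum are arbitrary choices of mine. It evaluates $\langle x^{2n} \rangle$ numerically and compares it with $(2n-1)!!/a^n$.

```python
import math

def gaussian_average(power, a, half_width=12.0, n_points=24001):
    """<x^power> with weight exp(-a x^2 / 2), estimated by a simple Riemann sum."""
    dx = 2.0 * half_width / (n_points - 1)
    numerator = 0.0
    denominator = 0.0
    for j in range(n_points):
        x = -half_width + j * dx
        weight = math.exp(-0.5 * a * x * x)
        numerator += weight * x ** power
        denominator += weight
    # The common factor dx cancels in the ratio.
    return numerator / denominator

def double_factorial_odd(n):
    """(2n - 1)!! = (2n - 1)(2n - 3)...5*3*1, with the empty product equal to 1."""
    result = 1
    for odd in range(1, 2 * n, 2):
        result *= odd
    return result

A = 2.0  # illustrative value
for n in range(1, 5):
    numeric = gaussian_average(2 * n, A)
    predicted = double_factorial_odd(n) / A ** n
    print(n, numeric, predicted)
```

For $n = 3$ the prediction is $15/a^3$, the fifteen Wick contractions of $\langle x^6 \rangle$ each contributing $a^{-3}$.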

An important variant is the integral 

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2 + Jx} = \Big(\frac{2\pi}{a}\Big)^{1/2} e^{J^2/2a} \qquad (19)$$

To see this, take the expression in the exponent and “complete the square”: $-ax^2/2 + Jx = -(a/2)(x^2 - 
2Jx/a) = -(a/2)(x - J/a)^2 + J^2/2a$. The $x$ integral can now be done by shifting $x \to x + J/a$, giving the 
factor of $(2\pi/a)^{1/2}$. Check that we can also obtain (18) by differentiating with respect to $J$ repeatedly and then 
setting $J = 0$. 

Another important variant is obtained by replacing $J$ by $iJ$: 

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2 + iJx} = \Big(\frac{2\pi}{a}\Big)^{1/2} e^{-J^2/2a} \qquad (20)$$

To get yet another variant, replace $a$ by $-ia$: 

$$\int_{-\infty}^{+\infty} dx\, e^{\frac{1}{2}iax^2 + iJx} = \Big(\frac{2\pi i}{a}\Big)^{1/2} e^{-iJ^2/2a} \qquad (21)$$
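
The variants (19) and (20) are easy to confirm numerically. The Python sketch below is an illustrative check with arbitrarily chosen values `A = 1.5` and `J = 0.7` and a plain Riemann sum (all assumptions of mine). The purely oscillatory variant (21) is left out because its integrand does not decay, so a naive sum over a finite window is not a reliable estimate of it.

```python
import cmath
import math

def integrate(integrand, half_width=15.0, n_points=30001):
    """Plain Riemann sum over [-half_width, half_width]."""
    dx = 2.0 * half_width / (n_points - 1)
    return sum(integrand(-half_width + j * dx) for j in range(n_points)) * dx

A, J = 1.5, 0.7  # illustrative values

# Equation (19): real source term J x.
lhs_19 = integrate(lambda x: math.exp(-0.5 * A * x * x + J * x))
rhs_19 = math.sqrt(2.0 * math.pi / A) * math.exp(J * J / (2.0 * A))

# Equation (20): imaginary source term i J x.
lhs_20 = integrate(lambda x: cmath.exp(-0.5 * A * x * x + 1j * J * x))
rhs_20 = math.sqrt(2.0 * math.pi / A) * math.exp(-J * J / (2.0 * A))

print(lhs_19, rhs_19)
print(lhs_20, rhs_20)
```

The imaginary part of `lhs_20` should vanish up to rounding, since the integrand's imaginary part is odd in $x$.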


Let us promote $a$ to a real symmetric $N$ by $N$ matrix $A_{ij}$ and $x$ to a vector $x_i$ ($i, j = 1, \cdots, N$). Then (19) 
generalizes to 

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x + J\cdot x} = \Big(\frac{(2\pi)^N}{\det[A]}\Big)^{1/2} e^{\frac{1}{2}J\cdot A^{-1}\cdot J} \qquad (22)$$

where $x\cdot A\cdot x = x_i A_{ij} x_j$ and $J\cdot x = J_i x_i$ (with repeated indices summed). 

To derive this important relation, diagonalize $A$ by an orthogonal transformation $O$ so that $A = O^{-1}\cdot D\cdot O$, 
where $D$ is a diagonal matrix. Call $y_i = O_{ij}x_j$. In other words, we rotate the coordinates in the $N$-dimensional 
Euclidean space we are integrating over. The expression in the exponential in the integrand then becomes 
$-\frac{1}{2}y\cdot D\cdot y + (OJ)\cdot y$. Using $\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1\cdots dx_N = \int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dy_1\cdots dy_N$, we factorize the left-hand 
side of (22) into a product of $N$ integrals, each of the form $\int_{-\infty}^{+\infty} dy_i\, e^{-\frac{1}{2}D_{ii}y_i^2 + (OJ)_i y_i}$. Plugging into (19) we 
obtain the right hand side of (22), since $(OJ)\cdot D^{-1}\cdot(OJ) = J\cdot O^{-1}D^{-1}O\cdot J = J\cdot A^{-1}\cdot J$ (where we use the 
orthogonality of $O$). (To make sure you got it, try this explicitly for $N = 2$.) 
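
For readers who would rather let a computer try the $N = 2$ case first, here is a minimal Python sketch. It is a numerical check, not the analytic exercise the text asks for, and the particular matrix `A`, source `J`, and grid are arbitrary choices of mine. It sums the integrand of (22) over a 2-dimensional grid and compares with $\big((2\pi)^2/\det[A]\big)^{1/2} e^{\frac{1}{2}J\cdot A^{-1}\cdot J}$.

```python
import math

# Hypothetical example values: A must be real, symmetric, and positive definite.
A = [[2.0, 0.6], [0.6, 1.0]]
J = [0.3, -0.5]

det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_INV = [[A[1][1] / det_a, -A[0][1] / det_a],
         [-A[1][0] / det_a, A[0][0] / det_a]]

def exponent(x1, x2):
    """-(1/2) x.A.x + J.x for N = 2."""
    quadratic = A[0][0] * x1 * x1 + 2.0 * A[0][1] * x1 * x2 + A[1][1] * x2 * x2
    return -0.5 * quadratic + J[0] * x1 + J[1] * x2

def left_hand_side(half_width=10.0, n_points=401):
    """Double Riemann sum approximating the left-hand side of (22)."""
    dx = 2.0 * half_width / (n_points - 1)
    points = [-half_width + j * dx for j in range(n_points)]
    total = 0.0
    for x1 in points:
        for x2 in points:
            total += math.exp(exponent(x1, x2))
    return total * dx * dx

def right_hand_side():
    """Closed form on the right-hand side of (22)."""
    j_ainv_j = sum(J[i] * A_INV[i][k] * J[k] for i in range(2) for k in range(2))
    return math.sqrt((2.0 * math.pi) ** 2 / det_a) * math.exp(0.5 * j_ainv_j)

lhs = left_hand_side()
rhs = right_hand_side()
print(lhs, rhs)
```

The two numbers should agree to many digits. If `A` is changed so that an eigenvalue becomes zero or negative, the integral no longer converges and the comparison is meaningless.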

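Putting in some $i$’s ($A \to -iA$, $J \to iJ$), we find the generalization of (22) 

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{(i/2)x\cdot A\cdot x + iJ\cdot x} = \Big(\frac{(2\pi i)^N}{\det[A]}\Big)^{1/2} e^{-(i/2)J\cdot A^{-1}\cdot J} \qquad (23)$$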

The generalization of (18) is also easy to obtain. Differentiate (22) $p$ times with respect to $J_i$, $J_j$, $\cdots$, $J_k$, and 
$J_l$, and then set $J = 0$. For example, for $p = 1$ the integrand in (22) becomes $e^{-\frac{1}{2}x\cdot A\cdot x}\, x_i$ and since the integrand 
is now odd in $x_i$ the integral vanishes. For $p = 2$ the integrand becomes $e^{-\frac{1}{2}x\cdot A\cdot x}(x_i x_j)$, while on the right hand 
side we bring down $A^{-1}_{ij}$. Rearranging and eliminating $\det[A]$ (by setting $J = 0$ in (22)), we obtain 

$$\langle x_i x_j \rangle = \frac{\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x}\, x_i x_j}{\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x}} = A^{-1}_{ij}$$

Just do it. Doing it is easier than explaining how to do it. Then do it for p = 3 and 4. You will see immediately how your result generalizes. When the set of indices i, j, ···, k, l contains an odd number of elements, $\langle x_i x_j\cdots x_k x_l\rangle$ vanishes trivially. When the set of indices i, j, ···, k, l contains an even number of elements, we have

$$\langle x_i x_j\cdots x_k x_l\rangle = \sum_{\text{Wick}} (A^{-1})_{ab}\cdots(A^{-1})_{cd} \tag{24}$$

where we have defined

$$\langle x_i x_j\cdots x_k x_l\rangle = \frac{\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1\, dx_2\cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x}\, x_i x_j\cdots x_k x_l}{\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1\, dx_2\cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x}} \tag{25}$$


and where the set of indices {a, b, ···, c, d} represents a permutation of {i, j, ···, k, l}. The sum in (24) is over all such permutations or Wick contractions.

For example,

$$\langle x_i x_j x_k x_l\rangle = (A^{-1})_{ij}(A^{-1})_{kl} + (A^{-1})_{il}(A^{-1})_{jk} + (A^{-1})_{ik}(A^{-1})_{jl} \tag{26}$$

(Recall that A, and thus $A^{-1}$, is symmetric.) As in the simple case when x does not carry any index, we could connect the x’s in $\langle x_i x_j x_k x_l\rangle$ in pairs (Wick contraction) and write a factor $(A^{-1})_{ab}$ if we connect $x_a$ to $x_b$.

Notice that since $\langle x_i x_j\rangle = (A^{-1})_{ij}$ the right-hand side of (24) can also be written in terms of objects like $\langle x_i x_j\rangle$. Thus, $\langle x_i x_j x_k x_l\rangle = \langle x_i x_j\rangle\langle x_k x_l\rangle + \langle x_i x_l\rangle\langle x_j x_k\rangle + \langle x_i x_k\rangle\langle x_j x_l\rangle$.

Please work out $\langle x_i x_j x_k x_l x_m x_n\rangle$; you will become an expert on Wick contractions. Of course, (24) reduces to (18) for N = 1.
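If you want to check your pencil-and-paper contractions, the pairing sum in (24) is easy to enumerate by machine. The sketch below is an illustration only: the function names `pairings` and `wick_moment` are made up here, and the inverse matrix is passed in as a plain list of lists. It lists every way of connecting the x’s in pairs and sums the corresponding products of $(A^{-1})_{ab}$.

```python
def pairings(indices):
    """Yield every way of grouping the given index slots into unordered pairs."""
    indices = list(indices)
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def wick_moment(a_inv, indices):
    """Sum over Wick contractions of products of a_inv[a][b], as in (24).

    a_inv is the inverse matrix A^{-1} (hypothetical input, list of lists);
    indices may repeat, e.g. [0, 0, 0, 0] for the fourth moment of x_0.
    """
    if len(indices) % 2 == 1:
        return 0.0  # an odd number of x's cannot be fully paired
    total = 0.0
    for pairing in pairings(indices):
        term = 1.0
        for a, b in pairing:
            term *= a_inv[a][b]
        total += term
    return total

# Number of contractions for four and six x's.
print(sum(1 for _ in pairings(range(4))), sum(1 for _ in pairings(range(6))))
# N = 1 with a = 2, so A^{-1} = 1/2: the fourth moment is 3/a^2.
print(wick_moment([[0.5]], [0, 0, 0, 0]))
```

Counting the pairings for six x’s tells you how many terms to expect in the exercise above.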

Perhaps you are like me and do not like to memorize anything, but some of these formulas might be worth 
memorizing as they appear again and again in theoretical physics (and in this book). 


Appendix 3 


To do an exponential integral of the form $I = \int_{-\infty}^{+\infty} dq\, e^{-(1/\hbar) f(q)}$ we often have to resort to the steepest-descent approximation, which I will now review for your convenience. In the limit of ħ small, the integral is dominated by the minimum of f(q). Expanding $f(q) = f(a) + \frac{1}{2}f''(a)(q-a)^2 + O[(q-a)^3]$ and applying (17) we obtain

$$I = e^{-(1/\hbar) f(a)}\left(\frac{2\pi\hbar}{f''(a)}\right)^{1/2} e^{-O(\hbar^{1/2})} \tag{27}$$

For f(q) a function of many variables $q_1, \ldots, q_N$ and with a minimum at $q_j = a_j$, we generalize immediately to

$$I = e^{-(1/\hbar) f(a)}\left(\frac{(2\pi\hbar)^N}{\det f''(a)}\right)^{1/2} e^{-O(\hbar^{1/2})} \tag{28}$$

Here f''(a) denotes the N by N matrix with entries $[f''(a)]_{ij} \equiv (\partial^2 f/\partial q_i\,\partial q_j)|_{q=a}$. In many situations, we do not even need the factor involving the determinant in (28). If you can derive (28) you are well on your way to becoming a quantum field theorist!
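To see the approximation (27) at work, one can compare it with a direct numerical integration. The sketch below is purely illustrative: the function $f(q) = q^2/2 + q^4/4$ (minimum at a = 0 with f(a) = 0 and f''(a) = 1), the integration range, and the values of ħ are assumptions chosen here, not taken from the text. It prints the ratio of the numerical integral to the leading steepest-descent factor, which should approach 1 as ħ decreases.

```python
import math

def f(q):
    # Illustrative choice: minimum at q = 0 with f(0) = 0 and f''(0) = 1.
    return 0.5 * q * q + 0.25 * q ** 4

def exact_integral(hbar, half_width=2.0, n=40000):
    """Midpoint-rule estimate of the integral of exp(-f(q)/hbar) over q."""
    step = 2 * half_width / n
    return step * sum(
        math.exp(-f(-half_width + (i + 0.5) * step) / hbar) for i in range(n)
    )

def steepest_descent(hbar, f_min=0.0, f_second=1.0):
    """Leading factor of (27): exp(-f(a)/hbar) * sqrt(2 pi hbar / f''(a))."""
    return math.exp(-f_min / hbar) * math.sqrt(2 * math.pi * hbar / f_second)

ratios = {hbar: exact_integral(hbar) / steepest_descent(hbar)
          for hbar in (0.1, 0.01, 0.001)}
for hbar, ratio in ratios.items():
    print(hbar, ratio)
```

For this particular f the quartic term only suppresses the integrand further, so each ratio lies slightly below 1.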


Exercises 

I.2.1 Verify (5).

I.2.2 Derive (24).




From Mattress to Field 


The mattress in the continuum limit 


The path integral representation

$$Z = \langle 0|\, e^{-iHT}\, |0\rangle = \int Dq(t)\, e^{i\int_0^T dt\,[\frac{1}{2}m\dot q^2 - V(q)]} \tag{1}$$

(we suppress the factor $\langle 0|q_f\rangle\langle q_i|0\rangle$; we will come back to this issue later in this chapter) which we derived for the quantum mechanics of a single particle, can be generalized almost immediately to the case of N particles with the Hamiltonian

$$H = \sum_a \frac{1}{2m_a}\hat p_a^2 + V(\hat q_1, \hat q_2, \cdots, \hat q_N). \tag{2}$$


We simply keep track mentally of the position of the particles $q_a$ with a = 1, 2, ···, N. Going through the same steps as before, we obtain

$$Z = \langle 0|\, e^{-iHT}\, |0\rangle = \int Dq(t)\, e^{iS(q)}$$

with the action

$$S(q) = \int_0^T dt\left(\sum_a \frac{1}{2}m_a\dot q_a^2 - V[q_1, q_2, \cdots, q_N]\right). \tag{3}$$


The potential energy $V(q_1, q_2, \cdots, q_N)$ now includes interaction energy between particles, namely terms of the form $v(q_a - q_b)$, as well as the energy due to an external potential, namely terms of the form $w(q_a)$. In particular, let us now write the path integral description of the quantum dynamics of the mattress described in chapter I.1, with the potential

$$V(q_1, q_2, \cdots, q_N) = \sum_{ab}\frac{1}{2}k_{ab}(q_a - q_b)^2 + \cdots$$


We are now just a short hop and skip away from a quantum field theory! Suppose we are only interested in phenomena on length scales much greater than the lattice spacing l (see fig. I.1.1). Mathematically, we take the continuum limit l → 0. In this limit, we can replace the label a on the particles by a two-dimensional position vector $\vec x$, and so we write $q(t, \vec x)$ instead of $q_a(t)$. It is traditional to replace the Latin letter q by the Greek letter φ. The function $\varphi(t, \vec x)$ is called a field.

The kinetic energy $\sum_a \frac{1}{2}m_a\dot q_a^2$ now becomes $\int d^2x\, \frac{1}{2}\sigma(\partial\varphi/\partial t)^2$. We replace $\sum_a$ by $\int d^2x/l^2$ and denote the mass per unit area $m_a/l^2$ by σ. We take all the $m_a$’s to be equal; otherwise σ would be a function of $\vec x$, the system would be inhomogeneous, and we would have a hard time writing down a Lorentz-invariant action (see later).

We next focus on the first term in V. Assume for simplicity that the $k_{ab}$ connect only nearest neighbors on the lattice. For nearest-neighbor pairs $(q_a - q_b)^2 \simeq l^2(\partial\varphi/\partial x)^2 + \cdots$ in the continuum limit; the derivative is obviously taken in the direction that joins the lattice sites a and b.
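The nearest-neighbor replacement can be checked on any smooth profile. In the sketch below, the profile sin x, the sample point, and the spacings are arbitrary choices made for illustration, and the helper name `lattice_vs_continuum` is hypothetical. It compares $(q_a - q_b)^2$ for two neighboring sites with $l^2(\partial\varphi/\partial x)^2$ as the lattice spacing l shrinks.

```python
import math

def lattice_vs_continuum(phi, dphi, x, spacing):
    """Return ((q_a - q_b)^2, l^2 (dphi/dx)^2) for neighbors at x and x + l."""
    q_a = phi(x)
    q_b = phi(x + spacing)
    lattice = (q_a - q_b) ** 2
    continuum = spacing ** 2 * dphi(x) ** 2
    return lattice, continuum

# Illustrative smooth field profile: phi(x) = sin x, so dphi/dx = cos x.
for spacing in (0.1, 0.01, 0.001):
    lattice, continuum = lattice_vs_continuum(math.sin, math.cos, 0.3, spacing)
    print(spacing, lattice / continuum)
```

The printed ratio tends to 1 as l → 0; the deviation at finite l is what the dots in the expansion stand for.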

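Putting it together then, we have

$$S(q) \to S(\varphi) \equiv \int_0^T dt\int d^2x\, \frac{1}{2}\left\{\sigma\left(\frac{\partial\varphi}{\partial t}\right)^2 - \rho\left[\left(\frac{\partial\varphi}{\partial x}\right)^2 + \left(\frac{\partial\varphi}{\partial y}\right)^2\right] + \cdots\right\} \tag{4}$$

where the parameter ρ is determined by $k_{ab}$ and l. The precise relations do not concern us.

Henceforth in this book, we will take the T → ∞ limit so that we can integrate over all of spacetime in (4).

We can clean up a bit by writing $\rho = \sigma c^2$ and scaling $\varphi \to \varphi/\sqrt{\sigma}$, so that the combination $(\partial\varphi/\partial t)^2 - c^2[(\partial\varphi/\partial x)^2 + (\partial\varphi/\partial y)^2]$ appears in the Lagrangian. The parameter c evidently has the dimension of a velocity and defines the phase velocity of the waves on our mattress.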

We started with a mattress for pedagogical reasons. Of course nobody believes that 
the fields observed in Nature, such as the meson field or the photon field, are actually 
constructed of point masses tied together with springs. The modern view, which I will call 
Landau-Ginzburg, is that we start with the desired symmetry, say Lorentz invariance if we 
want to do particle physics, decide on the fields we want by specifying how they transform 
under the symmetry (in this case we decided on a scalar field φ), and then write down
the action involving no more than two time derivatives (because we don’t know how to 
quantize actions with more than two time derivatives). 

We end up with a Lorentz-invariant action (setting c = 1)

$$S = \int d^dx\left[\frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}m^2\varphi^2 - \frac{g}{3!}\varphi^3 - \frac{\lambda}{4!}\varphi^4 + \cdots\right] \tag{5}$$

where various numerical factors are put in for later convenience. The relativistic notation $(\partial\varphi)^2 \equiv \partial_\mu\varphi\,\partial^\mu\varphi = (\partial\varphi/\partial t)^2 - (\partial\varphi/\partial x)^2 - (\partial\varphi/\partial y)^2$ was explained in the note on convention. The dimension of spacetime, d, clearly can be any integer, even though in our mattress model it was actually 3. We often write d = D + 1 and speak of a (D + 1)-dimensional spacetime.

We see here the power of imposing a symmetry. Lorentz invariance together with the insistence that the Lagrangian involve only at most two powers of ∂/∂t immediately tells us that the Lagrangian can only have the form¹ $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - V(\varphi)$ with V some function of φ. For simplicity, we now restrict V to be a polynomial in φ, although much of the present discussion will not depend on this restriction. We will have a great deal more to say about symmetry later. Here we note that, for example, we could insist that physics is symmetric under φ → −φ, in which case V(φ) would have to be an even polynomial.

Now that you know what a quantum field theory is, you realize why I used the letter q 
to label the position of the particle in the previous chapter and not the more common x. 
In quantum field theory, x is a label, not a dynamical variable. The x appearing in $\varphi(t, \vec x)$ corresponds to the label a in $q_a(t)$ in quantum mechanics. The dynamical variable in field theory is not position, but the field φ. The variable x simply specifies which field variable we
are talking about. I belabor this point because upon first exposure to quantum field theory 
some students, used to thinking of x as a dynamical operator in quantum mechanics, are 
confused by its role here. 

In summary, we have the table

$$\begin{array}{rcl} q &\to& \varphi \\ a &\to& \vec x \\ q_a(t) &\to& \varphi(t, \vec x) = \varphi(x) \\ \sum_a &\to& \int d^Dx \end{array} \tag{6}$$


Thus we finally have the path integral defining a scalar field theory in d = (D + 1)-dimensional spacetime:

$$Z = \int D\varphi\, e^{i\int d^dx\,(\frac{1}{2}(\partial\varphi)^2 - V(\varphi))} \tag{7}$$

Note that a (0 + 1)-dimensional quantum field theory is just quantum mechanics.


The classical limit 


As I have already remarked, the path integral formalism is particularly convenient for taking the classical limit. Remembering that Planck’s constant ħ has the dimension of energy multiplied by time, we see that it appears in the unitary evolution operator $e^{(-i/\hbar)HT}$. Tracing through the derivation of the path integral, we see that we simply divide the overall factor i by ħ to get

$$Z = \int D\varphi\, e^{(i/\hbar)\int d^4x\, \mathcal{L}(\varphi)} \tag{8}$$

1 Strictly speaking, a term of the form $U(\varphi)(\partial\varphi)^2$ is also possible. In quantum mechanics, a term such as $U(q)(dq/dt)^2$ in the Lagrangian would describe a particle whose mass depends on position. We will not consider such “nasty” terms until much later.
In the limit ħ much smaller than the relevant action we are considering, we can evaluate the path integral using the stationary phase (or steepest descent) approximation, as I explained in the previous chapter in the context of quantum mechanics. We simply determine the extremum of $\int d^4x\, \mathcal{L}(\varphi)$. According to the usual Euler-Lagrange variational procedure, this leads to the equation

$$\partial_\mu\frac{\delta\mathcal{L}}{\delta(\partial_\mu\varphi)} - \frac{\delta\mathcal{L}}{\delta\varphi} = 0 \tag{9}$$

We thus recover the classical field equation, exactly as we should, which in our scalar field theory reads

$$(\partial^2 + m^2)\varphi(x) + \frac{g}{2}\varphi(x)^2 + \frac{\lambda}{6}\varphi(x)^3 + \cdots = 0 \tag{10}$$
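For readers who want the intermediate step between (9) and (10), here is a short worked version. It assumes, as the coefficients 1/2 and 1/6 in (10) suggest, that the cubic and quartic terms of the Lagrangian in (5) carry the couplings g/3! and λ/4!:

$$\begin{aligned}
\mathcal{L} &= \tfrac{1}{2}\partial_\mu\varphi\,\partial^\mu\varphi - \tfrac{1}{2}m^2\varphi^2 - \frac{g}{3!}\varphi^3 - \frac{\lambda}{4!}\varphi^4 + \cdots \\
\frac{\delta\mathcal{L}}{\delta(\partial_\mu\varphi)} &= \partial^\mu\varphi, \qquad
\frac{\delta\mathcal{L}}{\delta\varphi} = -m^2\varphi - \frac{g}{2}\varphi^2 - \frac{\lambda}{6}\varphi^3 - \cdots \\
\partial_\mu\partial^\mu\varphi + m^2\varphi + \frac{g}{2}\varphi^2 + \frac{\lambda}{6}\varphi^3 + \cdots &= 0
\end{aligned}$$

The factorials in the couplings are exactly what differentiation removes, which is one reason such numerical factors are convenient.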


The vacuum 

In the point particle quantum mechanics discussed in chapter I.2 we wrote the path integral for $\langle F|\, e^{-iHT}\, |I\rangle$, with some initial and final state, which we can choose at our pleasure. A convenient and particularly natural choice would be to take $|I\rangle = |F\rangle$ to be
the ground state. In quantum field theory what should we choose for the initial and final 
states? A standard choice for the initial and final states is the ground state or the vacuum 
state of the system, denoted by |0), in which, speaking colloquially, nothing is happening. 
In other words, we would calculate the quantum transition amplitude from the vacuum to 
the vacuum, which would enable us to determine the energy of the ground state. But this 
is not a particularly interesting quantity, because in quantum field theory we would like to 
measure all energies relative to the vacuum and so, by convention, would set the energy 
of the vacuum to zero (possibly by having to subtract a constant from the Lagrangian). 
Incidentally, the vacuum in quantum field theory is a stormy sea of quantum fluctuations, 
but for this initial pass at quantum field theory, we will not examine it in any detail. We 
will certainly come back to the vacuum in later chapters. 


Disturbing the vacuum 

We might enjoy doing something more exciting than watching a boiling sea of quantum 
fluctuations. We might want to disturb the vacuum. Somewhere in space, at some instant 
in time, we would like to create a particle, watch it propagate for a while, and then annihilate 
it somewhere else in space, at some later instant in time. In other words, we want to set 
up a source and a sink (sometimes referred to collectively as sources) at which particles 
can be created and annihilated. 

To see how to do this, let us go back to the mattress. Bounce up and down on it to create some excitations. Obviously, pushing on the mass labeled by a in the mattress corresponds to adding a term such as $J_a(t)q_a$ to the potential $V(q_1, q_2, \cdots, q_N)$. More generally, we can add $\sum_a J_a(t)q_a$. When we go to field theory this added term gets promoted to $\int d^Dx\, J(x)\varphi(x)$ in the field theory Lagrangian, according to the promotion table (6).

Figure I.3.1

This so-called source function $J(t, \vec x)$ describes how the mattress is being disturbed. We can choose whatever function we like, corresponding to our freedom to push on the mattress wherever and whenever we like. In particular, J(x) can vanish everywhere in spacetime except in some localized regions.

By bouncing up and down on the mattress we can get wave packets going off here and there (fig. I.3.1). This corresponds precisely to sources (and sinks) for particles. Thus, we really want the path integral

$$Z = \int D\varphi\, e^{i\int d^4x\,[\frac{1}{2}(\partial\varphi)^2 - V(\varphi) + J(x)\varphi(x)]} \tag{11}$$


Free field theory 

The functional integral in (11) is impossible to do except when

$$\mathcal{L}(\varphi) = \frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] \tag{12}$$

The corresponding theory is called the free or Gaussian theory. The equation of motion (9) works out to be $(\partial^2 + m^2)\varphi = 0$, known as the Klein-Gordon equation.² Being linear, it can be solved immediately to give $\varphi(\vec x, t) = e^{i(\omega t - \vec k\cdot\vec x)}$ with

$$\omega^2 = \vec k^2 + m^2 \tag{13}$$

2 The Klein-Gordon equation was actually discovered by Schrödinger before he found the equation that now bears his name. Later, in 1926, it was written down independently by Klein, Gordon, Fock, Kudar, de Donder, and Van Dungen.

In the natural units we are using, ħ = 1 and so frequency ω is the same as energy ħω and wave vector $\vec k$ is the same as momentum $\hbar\vec k$. Thus, we recognize (13) as the energy-momentum relation for a particle of mass m, namely the sophisticate’s version of the layperson’s E = mc². We expect this field theory to describe a relativistic particle of mass m.
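The statement that the plane wave solves the Klein-Gordon equation only when (13) holds can be checked with finite differences. The sketch below makes several simplifying assumptions for illustration: it works in (1 + 1) dimensions, the values of k and m are arbitrary, and the helper names are hypothetical. It evaluates $(\partial_t^2 - \partial_x^2 + m^2)\varphi$ on $\varphi = e^{i(\omega t - kx)}$ with ω fixed by (13).

```python
import cmath
import math

def plane_wave(t, x, k, m):
    """phi(x, t) = exp(i(omega t - k x)) with omega^2 = k^2 + m^2, as in (13)."""
    omega = math.sqrt(k * k + m * m)
    return cmath.exp(1j * (omega * t - k * x))

def klein_gordon_residual(t, x, k, m, h=1e-3):
    """Central-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) phi."""
    def phi(tt, xx):
        return plane_wave(tt, xx, k, m)
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h ** 2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h ** 2
    return d2t - d2x + m * m * phi(t, x)

# Arbitrary illustrative values of k, m, and the spacetime point.
residual = klein_gordon_residual(t=0.4, x=-1.1, k=1.3, m=0.7)
print(abs(residual))
```

The residual is small, limited only by the finite-difference step; choosing ω off the relation (13) would leave a residual of order one.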


Let us now evaluate (11) in this special case:

$$Z = \int D\varphi\, e^{i\int d^4x\,\{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] + J\varphi\}} \tag{14}$$

Integrating by parts under the $\int d^4x$ and not worrying about the possible contribution of boundary terms at infinity (we implicitly assume that the fields we are integrating over fall off sufficiently rapidly), we write

$$Z = \int D\varphi\, e^{i\int d^4x\,[-\frac{1}{2}\varphi(\partial^2 + m^2)\varphi + J\varphi]} \tag{15}$$

You will encounter functional integrals like this again and again in your study of field theory. The trick is to imagine discretizing spacetime. You don’t actually have to do it: Just imagine doing it. Let me sketch how this goes. Replace the function φ(x) by the vector $\varphi_i = \varphi(ia)$ with i an integer and a the lattice spacing. (For simplicity, I am writing things as if we were in 1-dimensional spacetime. More generally, just let the index i enumerate the lattice points in some way.) Then differential operators become matrices. For example, $\partial\varphi(ia) \to (1/a)(\varphi_{i+1} - \varphi_i) \equiv \sum_j M_{ij}\varphi_j$, with some appropriate matrix M. Integrals become sums. For example, $\int d^4x\, J(x)\varphi(x) \to a^4\sum_i J_i\varphi_i$.

Now, lo and behold, the integral (15) is just the integral we did in (I.2.23)

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dq_1\, dq_2\cdots dq_N\, e^{(i/2)\, q\cdot A\cdot q + iJ\cdot q} = \left(\frac{(2\pi i)^N}{\det[A]}\right)^{1/2} e^{-(i/2)\, J\cdot A^{-1}\cdot J} \tag{16}$$

The role of A in (16) is played in (15) by the differential operator $-(\partial^2 + m^2)$. The defining equation for the inverse, $A\cdot A^{-1} = I$ or $A_{ij}A^{-1}_{jk} = \delta_{ik}$, becomes in the continuum limit

$$-(\partial^2 + m^2)D(x - y) = \delta^{(4)}(x - y) \tag{17}$$

We denote the continuum limit of $A^{-1}_{jk}$ by D(x − y) (which we know must be a function of x − y, and not of x and y separately, since no point in spacetime is special). Note that in going from the lattice to the continuum Kronecker is replaced by Dirac. It is very useful to be able to go back and forth mentally between the lattice and the continuum.
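Here is one way to actually go back and forth on a computer. To keep things simple, the sketch below swaps in a one-dimensional Euclidean analogue, the operator $-d^2/dx^2 + m^2$ rather than the Minkowski operator $-(\partial^2 + m^2)$ of (17); this substitution, the lattice size, and the function names are assumptions made for illustration. On the lattice the operator is a tridiagonal matrix, the Dirac delta becomes a Kronecker delta divided by the spacing a, and inverting the matrix should reproduce the continuum answer $e^{-m|x|}/2m$.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def lattice_green_function(m, spacing, half_sites):
    """Solve (-d^2/dx^2 + m^2) G = delta on a lattice with G = 0 beyond the ends."""
    n = 2 * half_sites + 1
    off = -1.0 / spacing ** 2
    diag = [2.0 / spacing ** 2 + m * m] * n
    rhs = [0.0] * n
    rhs[half_sites] = 1.0 / spacing  # Kronecker delta / a stands in for Dirac delta
    return solve_tridiagonal([off] * n, diag, [off] * n, rhs)

m, a, half = 1.0, 0.01, 1000
G = lattice_green_function(m, a, half)
print(G[half], 1 / (2 * m))                    # value at x = 0
print(G[half + 100], math.exp(-m) / (2 * m))   # value at x = 1
```

The lattice and continuum values agree up to corrections that shrink as the spacing a is reduced.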

Our final result is

$$Z(J) = \mathcal{C}\, e^{-(i/2)\int\!\!\int d^4x\, d^4y\, J(x)D(x-y)J(y)} \equiv \mathcal{C}\, e^{iW(J)} \tag{18}$$

with D(x) determined by solving (17). The overall factor C, which corresponds to the overall factor with the determinant in (16), does not depend on J and, as will become clear in the discussion to follow, is often of no interest to us. I will often omit writing C altogether. Clearly, C = Z(J = 0) so that W(J) is defined by

$$Z(J) \equiv Z(J = 0)\, e^{iW(J)} \tag{19}$$

Observe that

$$W(J) = -\frac{1}{2}\int\!\!\int d^4x\, d^4y\, J(x)D(x-y)J(y) \tag{20}$$

is a simple quadratic functional of J. In contrast, Z(J) depends on arbitrarily high powers of J. This fact will be of importance in chapter I.7.


Free propagator 


The function D(x), known as the propagator, plays an essential role in quantum field 
theory. As the inverse of a differential operator it is clearly closely related to the Green’s 
function you encountered in a course on electromagnetism. 

Physicists are sloppy about mathematical rigor, but even so, they have to be careful once 
in a while to make sure that what they are doing actually makes sense. For the integral 
in (15) to converge for large φ we replace m² → m² − iε so that the integrand contains a factor $e^{-\varepsilon\int d^4x\,\varphi^2}$, where ε is a positive infinitesimal we will let tend to zero.³

We can solve (17) easily by going to momentum space and multiplying together four copies of the representation (I.2.11) of the Dirac delta function

$$\delta^{(4)}(x - y) = \int\frac{d^4k}{(2\pi)^4}\, e^{ik(x-y)} \tag{21}$$

The solution is

$$D(x - y) = \int\frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x-y)}}{k^2 - m^2 + i\varepsilon} \tag{22}$$

which you can check by plugging into (17):

$$-(\partial^2 + m^2)D(x - y) = \int\frac{d^4k}{(2\pi)^4}\, \frac{k^2 - m^2}{k^2 - m^2 + i\varepsilon}\, e^{ik(x-y)} = \int\frac{d^4k}{(2\pi)^4}\, e^{ik(x-y)} = \delta^{(4)}(x - y) \quad \text{as } \varepsilon \to 0.$$

Note that the so-called iε prescription we just mentioned is essential; otherwise the integral giving D(x) would hit a pole. The magnitude of ε is not important as long as it is infinitesimal, but the positive sign of ε is crucial as we will see presently. (More on this in chapter III.8.) Also, note that the sign of k in the exponential does not matter here by the symmetry k → −k.

To evaluate D(x) we first integrate over $k^0$ by the method of contours. Define $\omega_k \equiv +\sqrt{\vec k^2 + m^2}$ with a plus sign. The integrand has two poles in the complex $k^0$ plane, at $\pm\sqrt{\omega_k^2 - i\varepsilon}$, which in the ε → 0 limit are equal to $+\omega_k - i\varepsilon$ and $-\omega_k + i\varepsilon$. Thus for ε positive, one pole is in the lower half-plane and the other in the upper half-plane, and so as we go along the real $k^0$ axis from −∞ to +∞ we do not run into the poles. The issue is how to close the integration contour.

For $x^0$ positive, the factor $e^{ik^0x^0}$ is exponentially damped for $k^0$ in the upper half-plane. Hence we should extend the integration contour, which runs from −∞ to +∞ on the real axis, to include the infinite semicircle in the upper half-plane, thus enclosing the pole at $-\omega_k + i\varepsilon$ and giving $-i\int\frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{-i(\omega_k t - \vec k\cdot\vec x)}$. Again, note that we are free to flip the sign of $\vec k$. Also, as is conventional, we use $x^0$ and t interchangeably. (In view of some reader confusion here in the first edition, I might add that I generally use $x^0$ with $k^0$ and t with $\omega_k$; $k^0$ is a variable that can take on either sign but $\omega_k$ is a positive function of $\vec k$.)

3 As is customary, ε is treated as generic, so that ε multiplied by any positive number is still ε.

For $x^0$ negative, we do the opposite and close the contour in the lower half-plane, thus picking up the pole at $+\omega_k - i\varepsilon$. We now obtain $-i\int\frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{+i(\omega_k t - \vec k\cdot\vec x)}$.
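The contour argument can be tested numerically if we keep ε small but finite. The sketch below is an illustration under stated assumptions: the values ω = 1 and ε = 0.2, the cutoff, and the step are arbitrary choices, and the function names are hypothetical. It integrates $\int\frac{dk^0}{2\pi}\, e^{ik^0 t}/[(k^0)^2 - \omega^2 + i\varepsilon]$ along the real axis by brute force and compares with the residue result $-i e^{-is|t|}/2s$, where $s = \sqrt{\omega^2 - i\varepsilon}$ reduces to $\omega_k$ as ε → 0.

```python
import cmath
import math

def k0_integral_numeric(t, omega, eps, cutoff=1000.0, step=0.01):
    """Trapezoid estimate of the k0 integral along the real axis."""
    n = int(round(2 * cutoff / step))
    total = 0j
    for j in range(n + 1):
        k0 = -cutoff + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * cmath.exp(1j * k0 * t) / (k0 * k0 - omega * omega + 1j * eps)
    return total * step / (2 * math.pi)

def k0_integral_residue(t, omega, eps):
    """Residue result: close above for t > 0 (pole at -s), below for t < 0 (pole at +s)."""
    s = cmath.sqrt(omega * omega - 1j * eps)  # principal root: Re s > 0, Im s < 0
    return -1j * cmath.exp(-1j * s * abs(t)) / (2 * s)

for t in (1.0, -0.7):
    print(t, k0_integral_numeric(t, 1.0, 0.2), k0_integral_residue(t, 1.0, 0.2))
```

Both signs of t give the same form with |t|, which is the content of packaging the two contour results together with step functions.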

Recall that the Heaviside (we will meet this great and aptly named physicist in chapter IV.4) step function θ(t) is defined to be equal to 0 for t < 0 and equal to 1 for t > 0. As for what θ(0) should be, the answer is that since we are proud physicists and not nitpicking mathematicians we will just wing it when the need arises. The step function allows us to package our two integration results together as

$$D(x) = -i\int\frac{d^3k}{(2\pi)^3 2\omega_k}\left[e^{-i(\omega_k t - \vec k\cdot\vec x)}\,\theta(x^0) + e^{i(\omega_k t - \vec k\cdot\vec x)}\,\theta(-x^0)\right] \tag{23}$$

Physically, D(x) describes the amplitude for a disturbance in the field to propagate from the origin to x. Lorentz invariance tells us that it is a function of $x^2$ and the sign of $x^0$ (since these are the quantities that do not change under a Lorentz transformation). We thus expect drastically different behavior depending on whether x is inside or outside the lightcone defined by $x^2 = (x^0)^2 - \vec x^2 = 0$. Without evaluating the $d^3k$ integral we can see roughly how things go. Let us look at some cases.

In the future cone, x = (t, 0) with t > 0, $D(x) = -i\int\frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{-i\omega_k t}$, a superposition of plane waves and thus D(x) oscillates. In the past cone, x = (t, 0) with t < 0, $D(x) = -i\int\frac{d^3k}{(2\pi)^3 2\omega_k}\, e^{+i\omega_k t}$ oscillates with the opposite phase.

In contrast, for x spacelike rather than timelike, $x^0 = 0$, we have, upon interpreting θ(0) = 1/2 (the obvious choice; imagine smoothing out the step function), $D(x) = -i\int\frac{d^3k}{(2\pi)^3\, 2\sqrt{\vec k^2 + m^2}}\, e^{-i\vec k\cdot\vec x}$. The square root cut starting at ±im tells us that the characteristic value of $|\vec k|$ in the integral is of order m, leading to an exponential decay $\sim e^{-m|\vec x|}$, as we would expect. Classically, a particle cannot get outside the lightcone, but a quantum field can “leak” out over a distance of order $m^{-1}$ by the Heisenberg uncertainty principle.


Exercises 


I.3.1 Verify that D(x) decays exponentially for spacelike separation.

I.3.2 Work out the propagator D(x) for a free field theory in (1 + 1)-dimensional spacetime and study the large $x^1$ behavior for $x^0 = 0$.

I.3.3 Show that the advanced propagator defined by

$$D_{\text{adv}}(x - y) = \int\frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x-y)}}{k^2 - m^2 - i\,\text{sgn}(k_0)\,\varepsilon}$$

(where the sign function is defined by sgn($k_0$) = +1 if $k_0$ > 0 and sgn($k_0$) = −1 if $k_0$ < 0) is nonzero only if $x^0 > y^0$. In other words, it only propagates into the future. [Hint: both poles of the integrand are now in the upper half of the $k_0$-plane.] Incidentally, some authors prefer to write $(k_0 - i\varepsilon)^2 - \vec k^2 - m^2$ instead of $k^2 - m^2 - i\,\text{sgn}(k_0)\,\varepsilon$ in the integrand. Similarly, show that the retarded propagator

$$D_{\text{ret}}(x - y) = \int\frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x-y)}}{k^2 - m^2 + i\,\text{sgn}(k_0)\,\varepsilon}$$

propagates into the past.




From Field to Particle to Force 


From field to particle 


In the previous chapter we obtained for the free theory

$$W(J) = -\frac{1}{2}\int\!\!\int d^4x\, d^4y\, J(x)D(x - y)J(y) \tag{1}$$

which we now write in terms of the Fourier transform $J(k) \equiv \int d^4x\, e^{-ikx}J(x)$:

$$W(J) = -\frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\, J(k)^*\, \frac{1}{k^2 - m^2 + i\varepsilon}\, J(k) \tag{2}$$

[Note that $J(k)^* = J(-k)$ for J(x) real.]
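The reality property just noted is easy to confirm on sampled data. In this sketch the source profile, the grid, and the one-dimensional Riemann-sum stand-in for the Fourier transform are all illustrative assumptions; only the relation $J(k)^* = J(-k)$ comes from the text.

```python
import cmath
import math

def fourier_transform(samples, k, spacing):
    """Riemann-sum stand-in for J(k) = integral dx exp(-ikx) J(x) in one dimension.

    The samples are assumed to sit at x = (i - n//2) * spacing.
    """
    n = len(samples)
    return spacing * sum(
        s * cmath.exp(-1j * k * (i - n // 2) * spacing)
        for i, s in enumerate(samples)
    )

spacing = 0.05
xs = [(i - 200) * spacing for i in range(401)]
# Arbitrary real, localized source profile (hypothetical).
source = [math.exp(-(x - 1.0) ** 2) * (1 + 0.5 * x) for x in xs]

for k in (0.3, 1.7):
    print(fourier_transform(source, k, spacing).conjugate(),
          fourier_transform(source, -k, spacing))
```

Each printed pair coincides, which is why $J(k)^*$ and $J(-k)$ can be used interchangeably in (2) for real sources.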

We can jump up and down on the mattress any way we like. In other words, we can 
choose any J (x) we want, and by exploiting this freedom of choice, we can extract a 
remarkable amount of physics. 

Consider $J(x) = J_1(x) + J_2(x)$, where $J_1(x)$ and $J_2(x)$ are concentrated in two local regions 1 and 2 in spacetime (fig. I.4.1). Then W(J) contains four terms, of the form $J_1^*J_1$, $J_2^*J_2$, $J_1^*J_2$, and $J_2^*J_1$. Let us focus on the last two of these terms, one of which reads

$$W(J) = -\frac{1}{2}\int\frac{d^4k}{(2\pi)^4}\, J_2(k)^*\, \frac{1}{k^2 - m^2 + i\varepsilon}\, J_1(k) \tag{3}$$

We see that W(J) is large only if $J_1(x)$ and $J_2(x)$ overlap significantly in their Fourier transform and if in the region of overlap in momentum space $k^2 - m^2$ almost vanishes. There is a “resonance type” spike at $k^2 = m^2$, that is, if the energy-momentum relation of a particle of mass m is satisfied. (We will use the language of the relativistic physicist, writing “momentum space” for energy-momentum space, and lapse into nonrelativistic language only when the context demands it, such as in “energy-momentum relation.”)

We thus interpret the physics contained in our simple field theory as follows: In region 1 in spacetime there exists a source that sends out a “disturbance in the field,” which is later absorbed by a sink in region 2 in spacetime. Experimentalists choose to call this disturbance in the field a particle of mass m. Our expectation based on the equation of motion that the theory contains a particle of mass m is fulfilled.

Figure I.4.1

A bit of jargon: When $k^2 = m^2$, k is said to be on mass shell. Note, however, that in (3)
we integrate over all k, including values of k far from the mass shell. For arbitrary k, it is 
a linguistic convenience to say that a “virtual particle” of momentum k propagates from 
the source to the sink. 


From particle to force 

We can now go on to consider other possibilities for J(x) (which we will refer to generically as sources), for example, $J(x) = J_1(x) + J_2(x)$, where $J_a(x) = \delta^{(3)}(\vec x - \vec x_a)$. In other words, J(x) is a sum of sources that are time-independent infinitely sharp spikes located at $\vec x_1$ and $\vec x_2$ in space. (If you like more mathematical rigor than is offered here, you are welcome to replace the delta function by lumpy functions peaking at $\vec x_a$. You would simply clutter up the formulas without gaining much.) More picturesquely, we are describing two massive lumps sitting at $\vec x_1$ and $\vec x_2$ on the mattress and not moving at all [no time dependence in J(x)].

What do the quantum fluctuations in the field φ, that is, the vibrations in the mattress, do to the two lumps sitting on the mattress? If you expect an attraction between the two lumps, you are quite right.

As before, W(J) contains four terms. We neglect the “self-interaction” term $J_1J_1$ since this contribution would be present in W regardless of whether $J_2$ is present or not. We want to study the interaction between the two “massive lumps” represented by $J_1$ and $J_2$. Similarly we neglect $J_2J_2$.
Plugging into (1) and doing the integral over $d^3x$ and $d^3y$ we immediately obtain

$$W(J) = -\int\!\!\int dx^0\, dy^0\int\frac{dk^0}{2\pi}\, e^{ik^0(x-y)^0}\int\frac{d^3k}{(2\pi)^3}\, \frac{e^{i\vec k\cdot(\vec x_1 - \vec x_2)}}{k^2 - m^2 + i\varepsilon} \tag{4}$$

(The factor 2 comes from the two terms $J_2J_1$ and $J_1J_2$.) Integrating over $y^0$ we get a delta function setting $k^0$ to zero (so that k is certainly not on mass shell, to throw the jargon around a bit). Thus we are left with

$$W(J) = \left(\int dx^0\right)\int\frac{d^3k}{(2\pi)^3}\, \frac{e^{i\vec k\cdot(\vec x_1 - \vec x_2)}}{\vec k^2 + m^2} \tag{5}$$

Note that the infinitesimal iε can be dropped since the denominator $\vec k^2 + m^2$ is always positive.

The factor $(\int dx^0)$ should have filled us with fear and trepidation: an integral over time, it seems to be infinite. Fear not! Recall that in the path integral formalism $Z = \mathcal{C}\, e^{iW(J)}$ represents $\langle 0|\, e^{-iHT}\, |0\rangle = e^{-iET}$, where E is the energy due to the presence of the two sources acting on each other. The factor $(\int dx^0)$ produces precisely the time interval T. All is well. Setting iW = −iET we obtain from (5)

$$E = -\int\frac{d^3k}{(2\pi)^3}\, \frac{e^{i\vec k\cdot(\vec x_1 - \vec x_2)}}{\vec k^2 + m^2} \tag{6}$$

The integral is evaluated in an appendix. This energy is negative! The presence of two delta function sources, at $\vec x_1$ and $\vec x_2$, has lowered the energy. (Notice that for the two sources infinitely far apart, we have, as we might expect, E = 0: the infinitely rapidly oscillating exponential kills the integral.) In other words, two like objects attract each other by virtue of their coupling to the field φ. We have derived our first physical result in quantum field theory!

We identify E as the potential energy between two static sources. Even without doing the integral, we see by dimensional analysis that the characteristic distance beyond which the integral goes to zero is given by the inverse of the characteristic value of k, which is m. Thus, we expect the attraction between the two sources to decrease rapidly to zero over the distance 1/m.

The range of the attractive force generated by the field φ is determined inversely by the mass m of the particle described by the field. Got that?

The integral is done in the appendix to this chapter and gives

$$E = -\frac{1}{4\pi r}\, e^{-mr} \tag{7}$$

The result is as we expected: The potential drops off exponentially over the distance scale 1/m. Obviously, dE/dr > 0: The two massive lumps sitting on the mattress can lower the energy by getting closer to each other.
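A few lines of code make the qualitative statements about (7) concrete. This is only a sketch: the helper names and the sample values of r and m are arbitrary choices for illustration. It evaluates the energy (7), estimates dE/dr by a central difference, and shows both the exponential suppression at separations beyond 1/m and the m → 0 limit, in which the potential falls off only as 1/r.

```python
import math

def yukawa_energy(r, m):
    """Energy (7) between two static unit sources a distance r apart."""
    return -math.exp(-m * r) / (4 * math.pi * r)

def energy_slope(r, m, h=1e-6):
    """Central-difference estimate of dE/dr."""
    return (yukawa_energy(r + h, m) - yukawa_energy(r - h, m)) / (2 * h)

# dE/dr > 0: the energy is lowered by bringing the sources closer.
print(energy_slope(1.0, 1.0))
# Suppression relative to the massless case at a separation of five ranges.
print(yukawa_energy(5.0, 1.0) / yukawa_energy(5.0, 0.0), math.exp(-5.0))
# Massless limit: E = -1/(4 pi r).
print(yukawa_energy(2.0, 0.0), -1 / (8 * math.pi))
```

Differentiating the massless potential once more in r gives a force that falls off as the inverse square of the separation.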

What we have derived was one of the most celebrated results in twentieth-century physics. Yukawa proposed that the attraction between nucleons in the atomic nucleus is due to their coupling to a field like the φ field described here. The known range of the nuclear force enabled him to predict not only the existence of the particle associated with this field, now called the π meson¹ or the pion, but its mass as well. As you probably know, the pion was eventually discovered with essentially the properties predicted by Yukawa.


Origin of force 

That the exchange of a particle can produce a force was one of the most profound conceptual advances in physics. We now associate a particle with each of the known forces: for example, the photon with the electromagnetic force and the graviton with the gravitational force; the former is experimentally well established and while the latter has not yet been detected experimentally hardly anyone doubts its existence. We will discuss the photon and the graviton in the next chapter, but we can already answer a question smart high school students often ask: Why do Newton’s gravitational force and Coulomb’s electric force both obey the 1/r² law?

We see from (7) that if the mass m of the mediating particle vanishes, the force produced will obey the 1/r² law. If you trace back over our derivation, you will see that this comes from the fact that the Lagrangian density for the simplest field theory involves two powers of the spacetime derivative ∂ (since any term involving one derivative such as φ∂φ is not Lorentz invariant). Indeed, the power dependence of the potential follows simply from dimensional analysis: $\int d^3k\,(e^{i\vec k\cdot\vec x}/k^2) \sim 1/r$.

Connected versus disconnected 

We end with a couple of formal remarks of importance to us only in chapter I.7. First, note that we might want to draw a small picture (fig. I.4.2) to represent the integrand J(x)D(x − y)J(y) in W(J): A disturbance propagates from y to x (or vice versa). In fact, this is the beginning of Feynman diagrams! Second, recall that

$$Z(J) = Z(J = 0)\sum_{n=0}^{\infty}\frac{[iW(J)]^n}{n!}$$

For instance, the n = 2 term in Z(J)/Z(J = 0) is given by

$$\frac{1}{2!}\left(-\frac{i}{2}\right)^2\int\!\!\int\!\!\int\!\!\int d^4x_1\, d^4x_2\, d^4x_3\, d^4x_4\, D(x_1 - x_2)\, D(x_3 - x_4)\, J(x_1)J(x_2)J(x_3)J(x_4)$$

The integrand is graphically described in figure I.4.3. The process is said to be disconnected: The propagation from $x_1$ to $x_2$ and the propagation from $x_3$ to $x_4$ proceed independently. We will come back to the difference between connected and disconnected in chapter I.7.

1 The etymology behind this word is quite interesting (A. Zee, Fearful Symmetry: see pp. 169 and 335 to learn, 
among other things, the French objection and the connection between meson and illusion). 
Appendix


Writing $\vec x \equiv (\vec x_1 - \vec x_2)$ and u ≡ cos θ with θ the angle between $\vec k$ and $\vec x$, we evaluate the integral in (6) in spherical coordinates (with $k = |\vec k|$ and $r = |\vec x|$):

$$I \equiv \frac{1}{(2\pi)^2}\int_0^\infty dk\, k^2\int_{-1}^{+1} du\, \frac{e^{ikru}}{k^2 + m^2} = \frac{2i}{(2\pi)^2\, ir}\int_0^\infty dk\, k\, \frac{\sin kr}{k^2 + m^2} \tag{8}$$

Since the integrand is even, we can extend the integral and write it as

$$\frac{1}{2}\int_{-\infty}^{\infty} dk\, k\, \frac{\sin kr}{k^2 + m^2} = \frac{1}{2i}\int_{-\infty}^{\infty} dk\, k\, \frac{e^{ikr}}{k^2 + m^2}.$$

Since r is positive, we can close the contour in the upper half-plane and pick up the pole at +im, obtaining $(1/2i)(2\pi i)(im/2im)\, e^{-mr} = (\pi/2)\, e^{-mr}$. Thus, $I = (1/4\pi r)\, e^{-mr}$.


Exercise 

I.4.1 Calculate the analog of the inverse square law in a (2 + 1)-dimensional universe, and more generally in a (D + 1)-dimensional universe.




Coulomb and Newton: Repulsion and Attraction 


Why like charges repel 

We suggested that quantum field theory can explain both Newton’s gravitational force and 
Coulomb’s electric force naturally. Between like objects Newton’s force is attractive while 
Coulomb’s force is repulsive. Is quantum field theory “smart enough” to produce this 
observational fact, one of the most basic in our understanding of the physical universe? 
You bet! 

We will first treat the quantum field theory of the electromagnetic field, known as 
quantum electrodynamics or QED for short. In order to avoid complications at this stage 
associated with gauge invariance (about which much more later) I will consider instead 
the field theory of a massive spin 1 meson, or vector meson. After all, experimentally all we 
know is an upper bound on the photon mass, which although tiny is not mathematically 
zero. We can adopt a pragmatic attitude: Calculate with a photon mass m and set m = 0 at 
the end, and if the result does not blow up in our faces, we will presume that it is OK. 1 

Recall Maxwell’s Lagrangian for electromagnetism $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$, where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$
with $A_\mu(x)$ the vector potential. You can see the reason for the important overall
minus sign in the Lagrangian by looking at the coefficient of $(\partial_0 A_i)^2$, which has to be
positive, just like the coefficient of $(\partial_0\varphi)^2$ in the Lagrangian for the scalar field. This says
simply that time variation should cost a positive amount of action.

I will now give the photon a small mass by changing the Lagrangian to $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2A_\mu A^\mu + A_\mu J^\mu$.
(The mass term is written in analogy to the mass term $m^2\varphi^2$ in
the scalar field Lagrangian; we will see shortly that the sign is correct and that this term
indeed leads to a photon mass.) I have also added a source $J^\mu(x)$, which in this context is
more familiarly known as a current. We will assume that the current is conserved so that
$\partial_\mu J^\mu = 0$.

1 When I took a field theory course as a student with Sidney Coleman this was how he treated QED in order 
to avoid discussing gauge invariance. 
Well, you know that the field theory of our vector meson is defined by the path integral
$Z = \int DA\, e^{iS(A)} \equiv e^{iW(J)}$ with the action

$$S(A) = \int d^4x\, \mathcal{L} = \int d^4x \left\{ \frac{1}{2}A_\mu\left[(\partial^2 + m^2)g^{\mu\nu} - \partial^\mu\partial^\nu\right]A_\nu + A_\mu J^\mu \right\} \tag{1}$$

The second equality follows upon integrating by parts [compare (I.3.15)].

By now you have learned that we simply apply (I.3.16). We merely have to find the inverse
of the differential operator in the square bracket; in other words, we have to solve

$$\left[(\partial^2 + m^2)g^{\mu\nu} - \partial^\mu\partial^\nu\right]D_{\nu\lambda}(x) = \delta^\mu_\lambda\,\delta^{(4)}(x) \tag{2}$$


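As before [compare (I.3.17)] we go to momentum space by defining

$$D_{\nu\lambda}(x) = \int \frac{d^4k}{(2\pi)^4}\, D_{\nu\lambda}(k)\, e^{ikx}$$

Plugging in, we find that $\left[-(k^2 - m^2)g^{\mu\nu} + k^\mu k^\nu\right]D_{\nu\lambda}(k) = \delta^\mu_\lambda$, giving

$$D_{\nu\lambda}(k) = \frac{-g_{\nu\lambda} + k_\nu k_\lambda/m^2}{k^2 - m^2} \tag{3}$$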


This is the photon, or more accurately the massive vector meson, propagator. Thus

$$W(J) = -\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, J^\mu(k)^*\, \frac{-g_{\mu\nu} + k_\mu k_\nu/m^2}{k^2 - m^2 + i\varepsilon}\, J^\nu(k) \tag{4}$$


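Since current conservation $\partial_\mu J^\mu(x) = 0$ gets translated into momentum space as
$k_\mu J^\mu(k) = 0$, we can throw away the $k_\mu k_\nu$ term in the photon propagator. The effective
action simplifies to

$$W(J) = \frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, J^\mu(k)^*\, \frac{1}{k^2 - m^2 + i\varepsilon}\, J_\mu(k) \tag{5}$$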


No further computation is needed to obtain a profound result. Just compare this result
to (I.4.2). The field theory has produced an extra sign. The potential energy between two
lumps of charge density $J^0(x)$ is positive. The electromagnetic force between like charges
is repulsive!
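
Two algebraic steps above lend themselves to a quick numerical check: that (3) really inverts the momentum-space operator, and that for a conserved current the numerator in (4) reduces to $-J^\mu J_\mu$, which is the origin of the sign flip in (5). The sketch below is an illustration only; the helper names, the sample momentum, and the projection used to build a conserved current are assumptions made here, and the current is taken real for simplicity.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g, signature (+, -, -, -)


def g(mu, nu):
    """Minkowski metric; g^{mu nu} and g_{mu nu} have the same entries."""
    return METRIC[mu] if mu == nu else 0.0


def dot(a, b):
    """Lorentz product a.b of two 4-vectors given with upper indices."""
    return sum(METRIC[i] * a[i] * b[i] for i in range(4))


def numerator(k, m):
    """N_{mu nu} = -g_{mu nu} + k_mu k_nu / m^2, the numerator in (3) and (4)."""
    k_low = [METRIC[i] * k[i] for i in range(4)]
    return [[-g(mu, nu) + k_low[mu] * k_low[nu] / m**2 for nu in range(4)]
            for mu in range(4)]


def operator_times_propagator(k, m):
    """[-(k^2 - m^2) g^{mu nu} + k^mu k^nu] D_{nu lam}(k); should equal delta^mu_lam."""
    k2 = dot(k, k)
    op = [[-(k2 - m**2) * g(mu, nu) + k[mu] * k[nu] for nu in range(4)]
          for mu in range(4)]
    prop = [[entry / (k2 - m**2) for entry in row] for row in numerator(k, m)]
    return [[sum(op[mu][nu] * prop[nu][lam] for nu in range(4))
             for lam in range(4)] for mu in range(4)]


def conserved_current(j, k):
    """Project an arbitrary real j^mu onto k.J = 0 (an illustrative construction)."""
    scale = dot(k, j) / dot(k, k)
    return [j[i] - scale * k[i] for i in range(4)]


def current_contraction(J, k, m):
    """J^mu N_{mu nu} J^nu for a real current J with upper indices."""
    N = numerator(k, m)
    return sum(J[mu] * N[mu][nu] * J[nu] for mu in range(4) for nu in range(4))
```

For any off-shell momentum the first function should return the identity matrix, and for a conserved current `current_contraction(J, k, m)` should equal `-dot(J, J)` regardless of m.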

We can now safely let the photon mass m go to zero thanks to current conservation.
[Note that we could not have done that in (3).] Indeed, referring to (I.4.7) we see that the
potential energy between like charges is

$$E = \frac{1}{4\pi r}\, e^{-mr} \to \frac{1}{4\pi r} \tag{6}$$

To accommodate positive and negative charges we can simply write $J^\mu = J^\mu_p - J^\mu_n$. We
see that a lump with charge density $J^0_p$ is attracted to a lump with charge density $J^0_n$.


Bypassing Maxwell 

Having done electromagnetism in two minutes flat let me now do gravity. Let us move on 
to the massive spin 2 meson field. In my treatment of the massive spin 1 meson field I 
took a short cut. Assuming that you are familiar with the Maxwell Lagrangian, I simply
added a mass term to it and took off. But I do not feel comfortable assuming that you are
equally familiar with the corresponding Lagrangian for the massless spin 2 field (the
so-called linearized Einstein Lagrangian, which I will discuss in a later chapter). So here I will
follow another strategy.

I invite you to think physically, and together we will arrive at the propagator for a massive 
spin 2 field. First, we will warm up with the massive spin 1 case. 

In fact, start with something even easier: the propagator $D(k) = 1/(k^2 - m^2)$ for a
massive spin 0 field. It tells us that the amplitude for the propagation of a spin 0 disturbance
blows up when the disturbance is almost a real particle. The residue of the pole is a property
of the particle. The propagator for a spin 1 field $D_{\nu\lambda}$ carries a pair of Lorentz indices and
in fact we know what it is from (3):

$$D_{\nu\lambda}(k) = \frac{-G_{\nu\lambda}}{k^2 - m^2} \tag{7}$$

where for later convenience we have defined

$$G_{\nu\lambda}(k) \equiv g_{\nu\lambda} - \frac{k_\nu k_\lambda}{m^2} \tag{8}$$

Let us now understand the physics behind $G_{\nu\lambda}$. I expect you to remember the concept
of polarization from your course on electromagnetism. A massive spin 1 particle has three
degrees of polarization for the obvious reason that in its rest frame its spin vector can point
in three different directions. The three polarization vectors $\varepsilon^{(a)}_\mu$ are simply the three unit
vectors pointing along the x, y, and z axes, respectively (a = 1, 2, 3): $\varepsilon^{(1)}_\mu = (0, 1, 0, 0)$,
$\varepsilon^{(2)}_\mu = (0, 0, 1, 0)$, $\varepsilon^{(3)}_\mu = (0, 0, 0, 1)$. In the rest frame $k^\mu = (m, 0, 0, 0)$ and so

$$k^\mu \varepsilon^{(a)}_\mu = 0 \tag{9}$$

Since this is a Lorentz invariant equation, it holds for a moving spin 1 particle as well.
Indeed, with a suitable normalization condition this fixes the three polarization vectors
$\varepsilon^{(a)}_\mu(k)$ for a particle with momentum k.

The amplitude for a particle with momentum k and polarization a to be created at
the source is proportional to $\varepsilon^{(a)}_\lambda(k)$, and the amplitude for it to be absorbed at the sink
is proportional to $\varepsilon^{(a)}_\nu(k)$. We multiply the amplitudes together to get the amplitude for
propagation from source to sink, and then sum over the three possible polarizations.
Now we understand the residue of the pole in the spin 1 propagator $D_{\nu\lambda}(k)$: It represents
$\sum_a \varepsilon^{(a)}_\nu(k)\,\varepsilon^{(a)}_\lambda(k)$. To calculate this quantity, note that by Lorentz invariance it can only be
a linear combination of $g_{\nu\lambda}$ and $k_\nu k_\lambda$. The condition $k^\mu\varepsilon^{(a)}_\mu = 0$ fixes it to be proportional
to $g_{\nu\lambda} - k_\nu k_\lambda/m^2$. We evaluate the left-hand side for k at rest with $\nu = \lambda = 1$, for instance,
and fix the overall and all-crucial sign to be $-1$. Thus

$$\sum_a \varepsilon^{(a)}_\nu(k)\,\varepsilon^{(a)}_\lambda(k) = -G_{\nu\lambda}(k) = -\left(g_{\nu\lambda} - \frac{k_\nu k_\lambda}{m^2}\right) \tag{10}$$

We have thus constructed the propagator $D_{\nu\lambda}(k)$ for a massive spin 1 particle, bypassing
Maxwell (see appendix 1).
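
The polarization sum (10) can be verified for a moving particle by boosting the rest-frame vectors. The code below is a hedged sketch: the choice of a boost along z, the helper names, and the sample mass and velocity are assumptions made for illustration.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}, signature (+, -, -, -)


def boost_z(vec, v):
    """Boost a 4-vector with upper indices along z with velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = vec
    return [gamma * (t + v * z), x, y, gamma * (z + v * t)]


def lower(vec):
    """Lower the index of a 4-vector with the Minkowski metric."""
    return [METRIC[i] * vec[i] for i in range(4)]


def polarization_sum(v):
    """Sum over a of eps^(a)_nu eps^(a)_lam for a particle moving along z."""
    rest_frame = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    eps = [lower(boost_z(e, v)) for e in rest_frame]
    return [[sum(e[nu] * e[lam] for e in eps) for lam in range(4)]
            for nu in range(4)]


def minus_G(m, v):
    """-(g_{nu lam} - k_nu k_lam / m^2) with k obtained by boosting (m, 0, 0, 0)."""
    k = lower(boost_z([m, 0.0, 0.0, 0.0], v))
    return [[-((METRIC[nu] if nu == lam else 0.0) - k[nu] * k[lam] / m**2)
             for lam in range(4)] for nu in range(4)]
```

Comparing `polarization_sum(v)` with `minus_G(m, v)` entry by entry should give agreement to rounding error for any $|v| < 1$, including the time-time component, which is nonzero only for the moving particle.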

Onward to spin 2! We want to similarly bypass Einstein. 
Bypassing Einstein

A massive spin 2 particle has 5 ($2 \cdot 2 + 1 = 5$, remember?) degrees of polarization, characterized
by the five polarization tensors $\varepsilon^{(a)}_{\mu\nu}$ ($a = 1, 2, \cdots, 5$) symmetric in the indices $\mu$
and $\nu$ satisfying

$$k^\mu \varepsilon^{(a)}_{\mu\nu} = 0 \tag{11}$$

and the tracelessness condition

$$g^{\mu\nu} \varepsilon^{(a)}_{\mu\nu} = 0 \tag{12}$$

Let’s count as a check. A symmetric Lorentz tensor has $4 \cdot 5/2 = 10$ components. The four
conditions in (11) and the single condition in (12) cut the number of components down to
$10 - 4 - 1 = 5$, precisely the right number. (Just to throw some jargon around, remember
how to construct irreducible group representations? If not, read appendix B.) We fix the
normalization of $\varepsilon_{\mu\nu}$ by setting the positive quantity $\sum_a \varepsilon^{(a)}_{12}(k)\,\varepsilon^{(a)}_{12}(k) = 1$.

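So, in analogy with the spin 1 case we now determine $\sum_a \varepsilon^{(a)}_{\mu\nu}(k)\,\varepsilon^{(a)}_{\lambda\sigma}(k)$. We have to
construct this object out of $g_{\mu\nu}$ and $k_\mu$, or equivalently $G_{\mu\nu}$ and $k_\mu$. This quantity must
be a linear combination of terms such as $G_{\mu\nu}G_{\lambda\sigma}$, $G_{\mu\nu}k_\lambda k_\sigma$, and so forth. Using (11) and
(12) repeatedly (exercise I.5.1) you will easily find that

$$\sum_a \varepsilon^{(a)}_{\mu\nu}(k)\,\varepsilon^{(a)}_{\lambda\sigma}(k) = (G_{\mu\lambda}G_{\nu\sigma} + G_{\mu\sigma}G_{\nu\lambda}) - \frac{2}{3}\, G_{\mu\nu}G_{\lambda\sigma} \tag{13}$$

The overall sign and proportionality constant are determined by evaluating both sides for
k at rest (for $\mu = \lambda = 1$ and $\nu = \sigma = 2$, for instance).

Thus, we have determined the propagator for a massive spin 2 particle

$$D_{\mu\nu,\lambda\sigma}(k) = \frac{(G_{\mu\lambda}G_{\nu\sigma} + G_{\mu\sigma}G_{\nu\lambda}) - \frac{2}{3}\, G_{\mu\nu}G_{\lambda\sigma}}{k^2 - m^2} \tag{14}$$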
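
The tensor structure in (13) can be checked directly in the rest frame, where $G_{0\mu}$ vanishes and the spatial block is $G_{ij} = -\delta_{ij}$. The sketch below assumes one particular basis of five symmetric traceless $3 \times 3$ tensors, normalized so that $\sum_a \varepsilon^{(a)}_{12}\varepsilon^{(a)}_{12} = 1$; the basis and the function names are choices made here for illustration, and any orthogonal basis with the same normalization would do.

```python
import math
from itertools import product

S3 = 1.0 / math.sqrt(3.0)

# Assumed rest-frame basis (spatial indices 0, 1, 2 stand for x, y, z).
POLARIZATIONS = [
    [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],       # xy
    [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],       # xz
    [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],       # yz
    [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 0.0]],      # xx - yy
    [[S3, 0.0, 0.0], [0.0, S3, 0.0], [0.0, 0.0, -2.0 * S3]],   # (xx + yy - 2 zz)/sqrt(3)
]


def G(i, j):
    """Spatial block of G_{mu nu} at rest: G_ij = g_ij = -delta_ij."""
    return -1.0 if i == j else 0.0


def lhs(i, j, l, s):
    """Sum over the five polarizations of eps_ij eps_ls."""
    return sum(e[i][j] * e[l][s] for e in POLARIZATIONS)


def rhs(i, j, l, s):
    """Right-hand side of (13) restricted to spatial indices at rest."""
    return G(i, l) * G(j, s) + G(i, s) * G(j, l) - (2.0 / 3.0) * G(i, j) * G(l, s)


def max_mismatch():
    """Largest difference between the two sides over all 81 index choices."""
    return max(abs(lhs(*idx) - rhs(*idx)) for idx in product(range(3), repeat=4))
```

`max_mismatch()` should vanish to rounding error, and `lhs(0, 1, 0, 1)` reproduces the normalization condition.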


Why we fall 

We are now ready to understand one of the fundamental mysteries of the universe: Why 
masses attract. 

Recall from your courses on electromagnetism and special relativity that the energy or
mass density out of which mass is composed is part of a stress-energy tensor $T^{\mu\nu}$. For our
purposes, in fact, all you need to remember is that $T^{\mu\nu}$ is a symmetric tensor and that
the component $T^{00}$ is the energy density. If you don’t remember, I will give you a physical
explanation in appendix 2.

To couple to the stress-energy tensor, we need a tensor field $\varphi_{\mu\nu}$ symmetric in its two
indices. In other words, the Lagrangian of the world should contain a term like $\varphi_{\mu\nu}T^{\mu\nu}$.
This is in fact how we know that the graviton, the particle responsible for gravity, has spin 2,
just as we know that the photon, the particle responsible for electromagnetism and hence
coupled to the current $J^\mu$, has spin 1. In Einstein’s theory, which we will discuss in a later
chapter, $\varphi_{\mu\nu}$ is of course part of the metric tensor.

Just as we pretended that the photon has a small mass to avoid having to discuss gauge 
invariance, we will pretend that the graviton has a small mass to avoid having to discuss 
general coordinate invariance . 2 Aha, we just found the propagator for a massive spin 2 
particle. So let’s put it to work. 

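In precise analogy to (4)

$$W(J) = -\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, J^\mu(k)^*\, \frac{-g_{\mu\nu} + k_\mu k_\nu/m^2}{k^2 - m^2 + i\varepsilon}\, J^\nu(k) \tag{15}$$

describing the interaction between two electromagnetic currents, the interaction between
two lumps of stress energy is described by

$$W(T) = -\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, T^{\mu\nu}(k)^*\, \frac{(G_{\mu\lambda}G_{\nu\sigma} + G_{\mu\sigma}G_{\nu\lambda}) - \frac{2}{3}\, G_{\mu\nu}G_{\lambda\sigma}}{k^2 - m^2 + i\varepsilon}\, T^{\lambda\sigma}(k) \tag{16}$$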

From the conservation of energy and momentum $\partial_\mu T^{\mu\nu}(x) = 0$ and hence
$k_\mu T^{\mu\nu}(k) = 0$, we can replace $G_{\mu\nu}$ in (16) by $g_{\mu\nu}$. (Here as is clear from the context $g_{\mu\nu}$ still
denotes the flat spacetime metric of Minkowski, rather than the curved metric of Einstein.)

Now comes the punchline. Look at the interaction between two lumps of energy density
$T^{00}$. We have from (16) that

$$W(T) = -\frac{1}{2}\int \frac{d^4k}{(2\pi)^4}\, T^{00}(k)^*\, \frac{1 + 1 - \frac{2}{3}}{k^2 - m^2 + i\varepsilon}\, T^{00}(k) \tag{17}$$

Comparing with (5) and using the well-known fact that $(1 + 1 - \frac{2}{3}) > 0$, we see that while
like charges repel, masses attract. Trumpets, please!
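
The sign bookkeeping can be laid out in a few lines. The sketch below evaluates, for static sources (only the 0 or 00 component nonzero), the numerator that multiplies $1/(k^2 - m^2 + i\varepsilon)$ once $G_{\mu\nu}$ has been replaced by $g_{\mu\nu}$ using conservation. A numerator with the same sign as the scalar case of chapter I.4 is read as attraction, the opposite sign as repulsion. The function names and the labelling helper are illustrative assumptions.

```python
from fractions import Fraction

METRIC = [1, -1, -1, -1]  # diagonal of g_{mu nu}, signature (+, -, -, -)


def g(mu, nu):
    """Flat Minkowski metric g_{mu nu}."""
    return METRIC[mu] if mu == nu else 0


def spin0_numerator():
    """Scalar exchange: the propagator numerator is simply 1."""
    return Fraction(1)


def spin1_numerator(mu, nu):
    """Numerator of (4) after dropping k_mu k_nu by current conservation."""
    return Fraction(-g(mu, nu))


def spin2_numerator(mu, nu, lam, sig):
    """Numerator of (16) with G replaced by g."""
    return (g(mu, lam) * g(nu, sig) + g(mu, sig) * g(nu, lam)
            - Fraction(2, 3) * g(mu, nu) * g(lam, sig))


def force_between_like_sources(numerator):
    """Same sign as the scalar case means attraction, opposite sign repulsion."""
    return "attractive" if numerator > 0 else "repulsive"
```

Evaluating `spin1_numerator(0, 0)` gives $-1$ and `spin2_numerator(0, 0, 0, 0)` gives $4/3$, matching the factor $(1 + 1 - \frac{2}{3})$ in (17).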


The universe 

It is difficult to overstate the importance (not to speak of the beauty) of what we have 
learned: The exchange of a spin 0 particle produces an attractive force, of a spin 1 particle 
a repulsive force, and of a spin 2 particle an attractive force, realized in the hadronic strong 
interaction, the electromagnetic interaction, and the gravitational interaction, respectively. 
The universal attraction of gravity produces an instability that drives the formation of 
structure in the early universe . 3 Denser regions become denser yet. The attractive nuclear 
force mediated by the spin 0 particle eventually ignites the stars. Furthermore, the attractive 
force between protons and neutrons mediated by the spin 0 particle is able to overcome 
the repulsive electric force between protons mediated by the spin 1 particle to form a 


2 For the moment, I ask you to ignore all subtleties and simply assume that in order to understand gravity it
is kosher to let $m \to 0$. I will give a precise discussion of Einstein’s theory of gravity in chapter VIII.1.

3 A good place to read about gravitational instability and the formation of structure in the universe along the
line sketched here is in A. Zee, Einstein's Universe (formerly known as An Old Man's Toy).
variety of nuclei without which the world would certainly be rather boring. The repulsion
between likes and hence attraction between opposites generated by the spin 1 particle allow 
electrically neutral atoms to form. 

The world results from a subtle interplay among spin 0, 1, and 2. 

In this lightning tour of the universe, we did not mention the weak interaction. In fact, 
the weak interaction plays a crucial role in keeping stars such as our sun burning at a 
steady rate. 


Time differs from space by a sign 

This weaving together of fields, particles, and forces to produce a universe rich with 
possibilities is so beautiful that it is well worth pausing to examine the underlying physics 
some more. The expression in (I.4.1) describes the effect of our disturbing the vacuum
(or the mattress!) with the source J, calculated to second order. Thus some readers may
have recognized that the negative sign in (I.4.6) comes from the elementary quantum
mechanical result that in second order perturbation theory the lowest energy state always
has its energy pushed downward: for the ground state all the energy denominators have
the same sign.

In essence, this “theorem” follows from the property of 2 by 2 matrices. Let us set the
ground state energy to 0 and crudely represent the entirety of the other states by a single
state with energy $w > 0$. Then the Hamiltonian including the perturbation v effective to
second order is given by

$$H = \begin{pmatrix} w & v \\ v & 0 \end{pmatrix}$$

Since the determinant of H (and hence the product of the two eigenvalues) is manifestly
negative, the ground state energy is pushed below 0. [More explicitly, we calculate the
eigenvalue $\varepsilon$ with the characteristic equation $0 = \varepsilon(\varepsilon - w) - v^2 \approx -(w\varepsilon + v^2)$, and hence
$\varepsilon \approx -v^2/w$.] In different fields of physics, this phenomenon is variously known as level
repulsion or the seesaw mechanism (see chapter VII.7).
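
A short numerical illustration of the level-repulsion statement, taking the 2 by 2 Hamiltonian to be $H = \begin{pmatrix} w & v \\ v & 0 \end{pmatrix}$ as implied by the characteristic equation quoted above. The function names and sample values are chosen here for illustration.

```python
import math


def eigenvalues(v, w):
    """Eigenvalues of H = [[w, v], [v, 0]], the roots of e^2 - w e - v^2 = 0."""
    root = math.sqrt(w * w + 4.0 * v * v)
    return (w - root) / 2.0, (w + root) / 2.0


def second_order_estimate(v, w):
    """Ground state shift to second order in the perturbation v."""
    return -v * v / w
```

With v = 0.1 and w = 10 the lower eigenvalue is slightly below zero and differs from the estimate $-v^2/w = -0.001$ only at order $v^4/w^3$, while the upper level is pushed slightly above w.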

Disturbing the vacuum with a source lowers its energy. Thus it is easy to understand
that generically the exchange of a particle leads to an attractive force.

But then why does the exchange of a spin 1 particle produce a repulsion between like
objects? The secret lies in the profundity that space differs from time by a sign, namely,
that $g_{00} = +1$ while $g_{ii} = -1$ for i = 1, 2, 3. In (10), the left-hand side is manifestly positive
for $\nu = \lambda = i$. Taking k to be at rest we understand the minus sign in (10) and hence in (4).
Roughly speaking, for spin 2 exchange the sign occurs twice in (16).


Degrees of freedom 

Now for a bit of cold water: Logically and mathematically the physics of a particle with mass
$m \neq 0$ could well be different from the physics with $m = 0$. Indeed, we know from classical
electromagnetism that an electromagnetic wave has 2 polarizations, that is, 2 degrees of
freedom. For a massive spin 1 particle we can go to its rest frame, where the rotation group
tells us that there are $2 \cdot 1 + 1 = 3$ degrees of freedom. The crucial piece of physics is that we
can never bring the massless photon to its rest frame. Mathematically, the rotation group
SO(3) degenerates into SO(2), the group of 2-dimensional rotations around the direction
of the photon’s momentum.

We will see in chapter II.7 that the longitudinal degree of freedom of a massive spin 1
meson decouples as we take the mass to zero. The treatment given here for the interaction
between charges (6) is correct. However, in the case of gravity, the $\frac{2}{3}$ in (17) is replaced by
1 in Einstein’s theory, as we will see in chapter VIII.1. Fortunately, the sign of the interaction
given in (17) does not change. Mute the trumpets a bit.


Appendix 1


Pretend that we never heard of the Maxwell Lagrangian. We want to construct a relativistic Lagrangian for a
massive spin 1 meson field. Together we will discover Maxwell. Spin 1 means that the field transforms as a vector
under the 3-dimensional rotation group. The simplest Lorentz object that contains the 3-dimensional vector is
obviously the 4-dimensional vector. Thus, we start with a vector field $A_\mu(x)$.

That the vector field carries mass m means that it satisfies the field equation

$$(\partial^2 + m^2)A_\mu = 0 \tag{18}$$

A spin 1 particle has 3 degrees of freedom [remember, in fancy language, the representation j of the rotation
group has dimension (2j + 1); here j = 1.] On the other hand, the field $A_\mu(x)$ contains 4 components. Thus, we
must impose a constraint to cut down the number of degrees of freedom from 4 to 3. The only Lorentz covariant
possibility (linear in $A_\mu$) is

$$\partial_\mu A^\mu = 0 \tag{19}$$

It may also be helpful to look at (18) and (19) in momentum space, where they read $(k^2 - m^2)A_\mu(k) = 0$ and
$k_\mu A^\mu(k) = 0$. The first equation tells us that $k^2 = m^2$ and the second that if we go to the rest frame $k^\mu = (m, \vec{0})$
then $A^0$ vanishes, leaving us with 3 nonzero components $A^i$ with i = 1, 2, 3.

The remarkable observation is that we can combine (18) and (19) into a single equation, namely

$$(g^{\mu\nu}\partial^2 - \partial^\mu\partial^\nu)A_\nu + m^2A^\mu = 0 \tag{20}$$

Verify that (20) contains both (18) and (19). Act with $\partial_\mu$ on (20). We obtain $m^2\partial_\mu A^\mu = 0$, which implies that
$\partial_\mu A^\mu = 0$. (At this step it is crucial that $m \neq 0$ and that we are not talking about the strictly massless photon.)
We have thus obtained (19); using (19) in (20) we recover (18).
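
The step "act with $\partial_\mu$ on (20)" can be mirrored in momentum space, where $\partial_\mu \to ik_\mu$ and the operator $g^{\mu\nu}\partial^2 - \partial^\mu\partial^\nu$ becomes $-k^2g^{\mu\nu} + k^\mu k^\nu$. The sketch below is illustrative only: the function names and sample vectors are assumptions, and overall factors of i are dropped. It checks that contracting the left-hand side of (20) with $k_\mu$ leaves just $m^2\, k \cdot A$ for an arbitrary $A$.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g, signature (+, -, -, -)


def dot(a, b):
    """Lorentz product a.b of two 4-vectors given with upper indices."""
    return sum(METRIC[i] * a[i] * b[i] for i in range(4))


def massive_vector_lhs(k, a, m):
    """Left side of (20) in momentum space, for A given by its upper components a.

    (-k^2 g^{mu nu} + k^mu k^nu) A_nu + m^2 A^mu = (-k^2 + m^2) A^mu + k^mu (k.A)
    """
    k2, k_dot_a = dot(k, k), dot(k, a)
    return [(-k2 + m * m) * a[mu] + k[mu] * k_dot_a for mu in range(4)]


def divergence(k, a, m):
    """k_mu contracted with the left side of (20)."""
    return dot(k, massive_vector_lhs(k, a, m))
```

Because the kinetic operator is annihilated by $k_\mu$, `divergence(k, a, m)` should equal `m * m * dot(k, a)` for any k and a, which is why the constraint (19) follows only when $m \neq 0$.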

We can now construct a Lagrangian by multiplying the left-hand side of (20) by $+\frac{1}{2}A_\mu$ (the $\frac{1}{2}$ is conventional
but the plus sign is fixed by physics, namely the requirement of positive kinetic energy); thus

$$\mathcal{L} = \frac{1}{2}A_\mu\left[(\partial^2 + m^2)g^{\mu\nu} - \partial^\mu\partial^\nu\right]A_\nu \tag{21}$$

Integrating by parts, we recognize this as the massive version of the Maxwell Lagrangian. In the limit $m \to 0$ we
recover Maxwell.

A word about terminology: Some people insist on calling only $F_{\mu\nu}$ a field and $A_\mu$ a potential. Conforming to
common usage, we will not make this fine distinction. For us, any dynamical function of spacetime is a field.
Appendix 2: Why does the graviton have spin 2?


First we have to understand why the photon has spin 1. Think physically. Consider a bunch of electrons at
rest inside a small box. An observer moving by sees the box Lorentz-Fitzgerald contracted and thus a higher
charge density than the observer at rest relative to the box. Thus charge density $J^0(x)$ transforms like the time
component of a 4-vector density $J^\mu(x)$. In other words, $J'^0 = J^0/\sqrt{1 - v^2}$. The photon couples to $J^\mu(x)$ and has
to be described by a 4-vector field $A_\mu(x)$ for the Lorentz indices to match.

What about energy density? The observer at rest relative to the box sees each electron contributing m to the
energy enclosed in the box. The moving observer, on the other hand, sees the electrons moving and thus each
having an energy $m/\sqrt{1 - v^2}$. With the contracted volume and the enhanced energy, the energy density gets
enhanced by two factors of $1/\sqrt{1 - v^2}$, that is, it transforms like the $T^{00}$ component of a 2-indexed tensor $T^{\mu\nu}$.
The graviton couples to $T^{\mu\nu}(x)$ and has to be described by a 2-indexed tensor field $\varphi_{\mu\nu}(x)$ for the Lorentz indices
to match.
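
The counting of factors of $1/\sqrt{1 - v^2}$ can be made concrete by boosting a static charge density and a static energy density. The sketch below is a minimal illustration with names chosen here: it boosts along z, with $J^\mu = (\rho, 0, 0, 0)$ and $T^{\mu\nu} = \mathrm{diag}(\epsilon, 0, 0, 0)$ in the rest frame of the box.

```python
import math


def boost_matrix(v):
    """Lorentz boost along z with velocity v (c = 1), acting on upper indices."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, 0.0, 0.0, gamma * v],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [gamma * v, 0.0, 0.0, gamma]]


def boost_vector(J, v):
    """J'^mu = L^mu_nu J^nu."""
    L = boost_matrix(v)
    return [sum(L[mu][nu] * J[nu] for nu in range(4)) for mu in range(4)]


def boost_tensor(T, v):
    """T'^{mu nu} = L^mu_a L^nu_b T^{ab}."""
    L = boost_matrix(v)
    return [[sum(L[mu][a] * L[nu][b] * T[a][b]
                 for a in range(4) for b in range(4))
             for nu in range(4)] for mu in range(4)]
```

For v = 0.6 the boost factor is 1.25, so the charge density should grow by 1.25 and the energy density by $1.25^2 = 1.5625$: one index, one factor; two indices, two factors.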


Exercise 

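I.5.1 Write down the most general form for $\sum_a \varepsilon^{(a)}_{\mu\nu}(k)\,\varepsilon^{(a)}_{\lambda\sigma}(k)$ using symmetry repeatedly. For example, it must
be invariant under the exchange $\{\mu\nu \leftrightarrow \lambda\sigma\}$. You might end up with something like

$$A\,G_{\mu\nu}G_{\lambda\sigma} + B(G_{\mu\lambda}G_{\nu\sigma} + G_{\mu\sigma}G_{\nu\lambda}) + C(G_{\mu\nu}k_\lambda k_\sigma + k_\mu k_\nu G_{\lambda\sigma}) + D(k_\mu k_\lambda G_{\nu\sigma} + k_\mu k_\sigma G_{\nu\lambda} + k_\nu k_\sigma G_{\mu\lambda} + k_\nu k_\lambda G_{\mu\sigma}) + E\,k_\mu k_\nu k_\lambda k_\sigma \tag{22}$$

with various unknown $A, \cdots, E$. Apply $k^\mu \sum_a \varepsilon^{(a)}_{\mu\nu}(k)\,\varepsilon^{(a)}_{\lambda\sigma}(k) = 0$ and find out what that implies for the
constants. Proceeding in this way, derive (13).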




Inverse Square Law and the Floating 3-Brane 


Why inverse square? 

In your first encounter with physics, didn’t you wonder why an inverse square force law
and not, say, an inverse cube law? In chapter I.4 you learned the deep answer. When a
massless particle is exchanged between two particles, the potential energy between the
two particles goes as

$$V(r) \propto \int d^3k\, e^{i\vec{k}\cdot\vec{x}}\, \frac{1}{k^2} \propto \frac{1}{r} \tag{1}$$

The spin of the exchanged particle controls the overall sign, but the 1/r follows just from
dimensional analysis, as I remarked earlier. Basically, V(r) is the Fourier transform of the
propagator. The $k^2$ in the propagator comes from the $(\partial_i\varphi \cdot \partial_i\varphi)$ term in the action, where
$\varphi$ denotes generically the field associated with the massless particle being exchanged, and
the $(\partial_i\varphi \cdot \partial_i\varphi)$ form is required by rotational invariance. It couldn’t be $k$ or $k^3$ in (1); $k^2$ is
the simplest possibility. So you can say that in some sense ultimately the inverse square
law comes from rotational invariance!

Physically, the inverse square law goes back to Faraday’s flux picture. Consider a sphere
of radius r surrounding a charge. The electric flux per unit area going through the sphere
varies as $1/4\pi r^2$. This geometric fact is reflected in the factor $d^3k$ in (1).


Brane world 

Remarkably, with the tiny bit of quantum field theory I have exposed you to, I can already
take you to the frontier of current research, current as of the writing of this book. In string
theory, our (3 + 1)-dimensional world could well be embedded in a larger universe, the
way a (2 + 1)-dimensional sheet of paper is embedded in our everyday (3 + 1)-dimensional
world. We are said to be living on a 3 brane.

So suppose there are n extra dimensions, with coordinates $x^4, x^5, \cdots, x^{n+3}$. Let the
characteristic scales associated with these extra coordinates be R. I can’t go into the
different detailed scenarios describing what R is precisely. For some reason I can’t go into
either, we are stuck on the 3 brane. In contrast, the graviton is associated intrinsically with
the structure of spacetime and so roams throughout the (n + 3 + 1)-dimensional universe.

All right, what is the gravitational force law between two particles? It is surely not your
grandfather’s gravitational force law: We Fourier transform

$$V(r) \propto \int d^{3+n}k\, e^{i\vec{k}\cdot\vec{x}}\, \frac{1}{k^2} \propto \frac{1}{r^{1+n}} \tag{2}$$

to obtain a $1/r^{1+n}$ law.

Doesn’t this immediately contradict observation? 

Well, no, because Newton’s law continues to hold for $r \gg R$. In this regime, the extra
coordinates are effectively zero compared to the characteristic length scale r we are interested
in. The flux cannot spread far in the direction of the n extra coordinates. Think of
the flux being forced to spread in only the three spatial directions we know, just as the
electromagnetic field in a wave guide is forced to propagate down the tube. Effectively we
are back in (3 + 1)-dimensional spacetime and V(r) reverts to a 1/r dependence.

The new law of gravity (2) holds only in the opposite regime $r \ll R$. Heuristically, when
R is much larger than the separation between the two particles, the flux does not know
that the extra coordinates are finite in extent and thinks that it lives in an (n + 3 + 1)-dimensional
universe.

Because of the weakness of gravity, Newton’s force law has not been tested to much 
accuracy at laboratory distance scales, and so there is plenty of room for theorists to 
speculate in: R could easily be much larger than the scale of elementary particles and yet 
much smaller than the scale of everyday phenomena. Incredibly, the universe could have 
“large extra dimensions”! (The word “large” means large on the scale of particle physics.) 


Planck mass 

To be quantitative, let us define the Planck mass $M_{Pl}$ by writing Newton’s law more
rationally as $V(r) = G_N m_1 m_2 (1/r) = (m_1 m_2/M_{Pl}^2)(1/r)$. Numerically, $M_{Pl} \simeq 10^{19}$ GeV.
This enormous value obviously reflects the weakness of gravity.

In fundamental units in which $\hbar$ and c are set to unity, gravity defines an intrinsic mass
or energy scale much higher than any scale we have yet explored experimentally. Indeed, 
one of the fundamental mysteries of contemporary particle physics is why this mass scale 
is so high compared to anything else we know of. I will come back to this so-called hierarchy 
problem in due time. For the moment, let us ask if this new picture of gravity, new in the 
waning moments of the last century, can alleviate the hierarchy problem by lowering the 
intrinsic mass scale of gravity. 

Denote the mass scale (the “true scale” of gravity) characteristic of gravity in the (n + 3 + 1)-dimensional
universe by $M_{TG}$ so that the gravitational potential between two objects
of masses $m_1$ and $m_2$ separated by a distance $r \ll R$ is given by

$$V(r) = \frac{m_1 m_2}{[M_{TG}]^{2+n}}\, \frac{1}{r^{1+n}}$$





Note that the dependence on $M_{TG}$ follows from dimensional analysis: two powers to cancel
$m_1 m_2$ and n powers to match the n extra powers of 1/r. For $r \gg R$, as we have argued, the
geometric spread of the gravitational flux is cut off by R so that the potential becomes

$$V(r) = \frac{m_1 m_2}{[M_{TG}]^{2+n}}\, \frac{1}{R^n}\, \frac{1}{r}$$

Comparing with the observed law $V(r) = (m_1 m_2/M_{Pl}^2)(1/r)$ we obtain

$$M_{TG}^2 = \frac{M_{Pl}^2}{[M_{TG}R]^n} \tag{3}$$

If $M_{TG}R$ could be made large enough, we have the intriguing possibility that the fundamental
scale of gravity $M_{TG}$ may be much lower than what we have always thought.

Accelerators (such as the Large Hadron Collider) could offer an exciting verification of
this intriguing possibility. If the true scale of gravity $M_{TG}$ lies in an energy range accessible
to the accelerator, there may be copious production of gravitons escaping into the higher
dimensional universe. Experimentalists would see a massive amount of missing energy.
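
To get a feel for the numbers in (3), one can solve it for R. The sketch below is an illustration under stated assumptions: the Planck mass is taken as the order-of-magnitude value $10^{19}$ GeV quoted above, the true scale of gravity is set to 1 TeV purely as an example of an accelerator-accessible value, and $\hbar c \approx 1.973 \times 10^{-16}$ GeV m is used to convert inverse GeV to meters. The function and constant names are chosen here.

```python
HBAR_C_GEV_M = 1.973269804e-16  # hbar * c in GeV * m, converts 1/GeV to meters
M_PLANCK_GEV = 1.0e19           # order of magnitude quoted in the text


def extra_dimension_radius_m(n, m_tg_gev):
    """Solve (3), M_TG^2 = M_Pl^2 / (M_TG R)^n, for R and return it in meters."""
    # (M_TG R)^n = (M_Pl / M_TG)^2, so R = (M_Pl / M_TG)^(2/n) / M_TG in natural units.
    radius_inverse_gev = (M_PLANCK_GEV / m_tg_gev) ** (2.0 / n) / m_tg_gev
    return radius_inverse_gev * HBAR_C_GEV_M
```

With the assumed $M_{TG} = 10^3$ GeV, `extra_dimension_radius_m(1, 1.0e3)` comes out of order $10^{13}$ m, roughly a hundred times the Earth-Sun distance, while `extra_dimension_radius_m(2, 1.0e3)` is of order a millimeter. The strong dependence on n is the point of the estimate.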


Exercise 

I.6.1 Putting in the numbers, show that the case n = 1 is already ruled out.




Feynman Diagrams 


Feynman brought quantum field theory to the masses. 

—J. Schwinger 


Anharmonicity in field theory 

The free field theory we studied in the last few chapters was easy to solve because the defining
path integral (I.3.14) is Gaussian, so we could simply apply (I.2.15). (This corresponds
to solving the harmonic oscillator in quantum mechanics.) As I noted in chapter I.3, within
the harmonic approximation the vibrational modes on the mattress can be linearly superposed
and thus they simply pass through each other. The particles represented by wave
packets constructed out of these modes do not interact: 1 hence the term free field theory.
To have the modes scatter off each other we have to include anharmonic terms in the Lagrangian
so that the equation of motion is no longer linear. For the sake of simplicity let
us add only one anharmonic term $-\frac{\lambda}{4!}\varphi^4$ to our free field theory and, recalling (I.3.11), try
to evaluate

$$Z(J) = \int D\varphi\, e^{i\int d^4x\left\{\frac{1}{2}\left[(\partial\varphi)^2 - m^2\varphi^2\right] - \frac{\lambda}{4!}\varphi^4 + J\varphi\right\}} \tag{1}$$

(We suppress the dependence of Z on $\lambda$.)

Doing quantum field theory is no sweat, you say, it just amounts to doing the functional 
integral (1). But the integral is not easy! If you could do it, it would be big news. 


Feynman diagrams made easy 

As an undergraduate, I heard of these mysterious little pictures called Feynman diagrams 
and really wanted to learn about them. I am sure that you too have wondered about those 

1 A potential source of confusion: Thanks to the propagation of $\varphi$, the sources coupled to $\varphi$ interact, as was
seen in chapter I.4, but the particles associated with $\varphi$ do not interact with each other. This is like saying that
charged particles coupled to the photon interact, but (to leading approximation) photons do not interact with
each other.
funny diagrams. Well, I want to show you that Feynman diagrams are not such a big deal:
Indeed we have already drawn little spacetime pictures in chapters I.3 and I.4 showing
how particles can appear, propagate, and disappear.

Feynman diagrams have long posed somewhat of an obstacle for first-time learners 
of quantum field theory. To derive Feynman diagrams, traditional texts typically adopt the 
canonical formalism (which I will introduce in the next chapter) instead of the path integral 
formalism used here. As we will see, in the canonical formalism fields appear as quantum 
operators. To derive Feynman diagrams, we would have to solve the equation of motion 
of the field operators perturbatively in $\lambda$. A formidable amount of machinery has to be
developed. 

In the opinion of those who prefer the path integral, the path integral formalism 
derivation is considerably simpler (naturally!). Nevertheless, the derivation can still get 
rather involved and the student could easily lose sight of the forest for the trees. There is 
no getting around the fact that you would have to put in some effort. 

I will try to make it as easy as possible for you. I have hit upon the great pedagogical 
device of letting you discover the Feynman diagrams for yourself. My strategy is to let you 
tackle two problems of increasing difficulty, what I call the baby problem and the child 
problem. By the time you get through these, the problem of evaluating (1) will seem much 
more tractable. 


A baby problem 


The baby problem is to evaluate the ordinary integral

$$Z(J) = \int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 - \frac{\lambda}{4!}q^4 + Jq} \tag{2}$$

evidently a much simpler version of (1).

First, a trivial point: we can always scale $q \to q/m$ so that $Z = m^{-1}\mathcal{F}\left(\frac{\lambda}{m^4}, \frac{J}{m}\right)$, but we
won’t.

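For $\lambda = 0$ this is just one of the Gaussian integrals done in the appendix of chapter I.2.
Well, you say, it is easy enough to calculate Z(J) as a series in $\lambda$: expand

$$Z(J) = \int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq}\left[1 - \frac{\lambda}{4!}q^4 + \frac{1}{2}\left(\frac{\lambda}{4!}\right)^2 q^8 + \cdots\right]$$

and integrate term by term. You probably even know one of several tricks for computing
$\int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq}\, q^{4n}$: you write it as $\left(\frac{d}{dJ}\right)^{4n}\int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq}$ and refer to (I.2.19). So

$$Z(J) = \left(1 - \frac{\lambda}{4!}\left(\frac{d}{dJ}\right)^4 + \frac{1}{2}\left(\frac{\lambda}{4!}\right)^2\left(\frac{d}{dJ}\right)^8 + \cdots\right)\int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq} \tag{3}$$

$$= e^{-\frac{\lambda}{4!}\left(\frac{d}{dJ}\right)^4}\int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq} = \left(\frac{2\pi}{m^2}\right)^{\frac{1}{2}} e^{-\frac{\lambda}{4!}\left(\frac{d}{dJ}\right)^4}\, e^{\frac{J^2}{2m^2}} \tag{4}$$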


(There are other tricks, such as differentiating $\int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 + Jq}$ with respect to $m^2$
repeatedly, but I want to discuss a trick that will also work for field theory.) By expanding
the two exponentials we can obtain any term in a double series expansion of Z(J) in $\lambda$ and
J. [We will often suppress the overall factor $(2\pi/m^2)^{\frac{1}{2}} = Z(J = 0, \lambda = 0) \equiv Z(0, 0)$ since it
will be common to all terms. When we want to be precise, we will define $\tilde{Z} = Z(J)/Z(0, 0)$.]

[Figure I.7.1]

For example, suppose we want the term of order $\lambda$ and $J^4$ in $\tilde{Z}$. We extract the order
$J^8$ term in $e^{J^2/2m^2}$, namely, $[1/4!(2m^2)^4]J^8$, replace $e^{-(\lambda/4!)(d/dJ)^4}$ by $-(\lambda/4!)(d/dJ)^4$,
and differentiate to get $[8!(-\lambda)/(4!)^3(2m^2)^4]J^4$. Another example: the term of order $\lambda^2$
and $J^4$ is $[12!(-\lambda)^2/(4!)^3\,6!\,2(2m^2)^6]J^4$. A third example: the term of order $\lambda^2$ and $J^6$ is
$\frac{1}{2}(\lambda/4!)^2(d/dJ)^8[1/7!(2m^2)^7]J^{14} = [14!(-\lambda)^2/(4!)^2\,6!\,7!\,2(2m^2)^7]J^6$. Finally, the term of order
$\lambda$ and $J^0$ is $[1/2(2m^2)^2](-\lambda)$.
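
The four examples follow one pattern, which can be coded in exact arithmetic. The sketch below is an illustration with a function name chosen here: it returns the coefficient of $\lambda^a J^b$ in $Z(J)/Z(0,0) = e^{-(\lambda/4!)(d/dJ)^4}\, e^{J^2/2m^2}$ by picking the $J^{4a+b}$ term of the Gaussian factor and differentiating it $4a$ times.

```python
from fractions import Fraction
from math import factorial


def series_coefficient(a: int, b: int, m2: Fraction = Fraction(1)) -> Fraction:
    """Coefficient of lambda^a J^b in exp(-(lambda/4!) (d/dJ)^4) exp(J^2 / (2 m^2))."""
    power = 4 * a + b  # power of J that 4a derivatives bring down to J^b
    if power % 2:
        return Fraction(0)  # exp(J^2 / 2m^2) contains only even powers of J
    p = power // 2
    # Coefficient of J^(2p) in exp(J^2 / 2m^2).
    gaussian_term = Fraction(1, factorial(p)) / (2 * m2) ** p
    # (d/dJ)^(4a) J^power = (power! / b!) J^b.
    derivative = Fraction(factorial(power), factorial(b))
    # a-th order term of exp(-(lambda/4!) (d/dJ)^4), with lambda^a stripped off.
    vertex = Fraction((-1) ** a, factorial(a) * factorial(4) ** a)
    return vertex * derivative * gaussian_term
```

With $m^2 = 1$, `series_coefficient(1, 4)`, `series_coefficient(2, 4)`, `series_coefficient(2, 6)`, and `series_coefficient(1, 0)` reproduce the four coefficients quoted above.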

You can do this as well as I can! Do a few more and you will soon see a pattern. In
fact, you will eventually realize that you can associate diagrams with each term and codify
some rules. Our four examples are associated with the diagrams in figures I.7.1-I.7.4.
You can see, for a reason you will soon understand, that each term can be associated with
several diagrams. I leave you to work out the rules carefully to get the numerical factors
right (but trust me, the “future of democracy” is not going to depend on them). The rules
go something like this: (1) diagrams are made of lines and vertices at which four lines
meet; (2) for each vertex assign a factor of $(-\lambda)$; (3) for each line assign $1/m^2$; and (4) for
each external end assign J (e.g., figure I.7.3 has seven lines, two vertices, and six ends,
giving $\sim [(-\lambda)^2/(m^2)^7]J^6$). (Did you notice that twice the number of lines is equal to four
times the number of vertices plus the number of ends? We will meet relations like that in
chapter III.2.)

In addition to the two diagrams shown in figure I.7.3, there are ten diagrams obtained
by adding an unconnected straight line to each of the ten diagrams in figure I.7.2. (Do you
understand why?)

For obvious reasons, some diagrams (e.g., figures I.7.1a and I.7.3a) are known as tree 2
diagrams and others (e.g., figures I.7.1b and I.7.2a) as loop diagrams.

Do as many examples as you need until you feel thoroughly familiar with what is going
on, because we are going to do exactly the same thing in quantum field theory. It will
look much messier, but only superficially. Be sure you understand how to use diagrams to
represent the double series expansion of $Z(J)$ before reading on. Please. In my experience
teaching, students who have not thoroughly understood the expansion of $Z(J)$ have no
hope of understanding what we are going to do in the field theory context.

2 The Chinese character for tree (A. Zee, Swallowing Clouds) is shown in fig. I.7.5. I leave it to you to figure
out why this diagram does not appear in our $Z(J)$.

Figure I.7.5


Wick contraction 


It is more obvious than obvious that we can expand $Z(J)$ in powers of $J$, if we please,
instead of in powers of $\lambda$. As you will see, particle physicists like to classify in powers of $J$.
In our baby problem, we can write

$$Z(J) = \sum_{s=0}^{\infty} \frac{1}{s!} J^s \int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2 - (\lambda/4!)q^4}\, q^s \equiv Z(0, 0) \sum_{s=0}^{\infty} \frac{1}{s!} J^s G^{(s)} \tag{5}$$

The coefficient $G^{(s)}$, whose analogs are known as “Green’s functions” in field theory, can
be evaluated as a series in $\lambda$ with each term determined by Wick contraction (I.2.10). For
instance, the $O(\lambda)$ term in $G^{(4)}$ is

$$\frac{-\lambda}{4!\,Z(0, 0)} \int_{-\infty}^{+\infty} dq\, e^{-\frac{1}{2}m^2q^2}\, q^8 = \frac{-7!!}{4!}\,\frac{\lambda}{m^8}$$

which of course better be equal³ to what we obtained above for the $\lambda J^4$ term in $Z$. Thus,
there are two ways of computing $Z$: you expand in $\lambda$ first or you expand in $J$ first.
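The agreement between the two routes can be checked in a few lines. This is a minimal sketch with $m = 1$; the function names are mine, and the only physics input is the Gaussian moment $\langle q^{2n}\rangle = (2n-1)!!/m^{2n}$ that Wick contraction gives.

```python
from fractions import Fraction
from math import factorial

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 or 2; returns 1 for n <= 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def gaussian_moment(power):
    """<q^power> for the weight exp(-q^2/2), i.e. m = 1: the number of Wick pairings."""
    return 0 if power % 2 else double_factorial(power - 1)

# "Wick way": O(lambda) part of G^(4) is -(1/4!) <q^8>, then (5) supplies another 1/4!
wick_way = Fraction(-gaussian_moment(8), factorial(4)) / factorial(4)

# "Schwinger way": the lambda J^4 coefficient obtained by differentiating exp(J^2/2)
schwinger_way = Fraction(-factorial(8), factorial(4) ** 3 * 2 ** 4)

print(wick_way, schwinger_way)
```

Both numbers are coefficients of $\lambda J^4/m^8$ in $Z(J)/Z(0,0)$, so they must coincide.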


Connected versus disconnected 

You will have noticed that some Feynman diagrams are connected and others are not.
Thus, figure I.7.3a is connected while I.7.3b is not. I presaged this at the end of chapter I.4
and in figures I.4.2 and I.4.3. Write

$$Z(J, \lambda) = Z(J = 0, \lambda)\, e^{W(J, \lambda)} = Z(J = 0, \lambda) \sum_{N=0}^{\infty} \frac{1}{N!} [W(J, \lambda)]^N \tag{6}$$

By definition, $Z(J = 0, \lambda)$ consists of those diagrams with no external source $J$, such as
the one in figure I.7.4. The statement is that $W$ is a sum of connected diagrams while
$Z$ contains connected as well as disconnected diagrams. Thus, figure I.7.3b consists of
two disconnected pieces and comes from the term $(1/2!)[W(J, \lambda)]^2$ in (6), the 2! taking
into account that it does not matter which of the two pieces you put “on the left or on the
right.” Similarly, figure I.7.2i comes from $(1/3!)[W(J, \lambda)]^3$. Thus, it is $W$ that we want to
calculate, not $Z$. If you’ve had a good course on statistical mechanics, you will recognize
that this business of connected graphs versus disconnected graphs is just what underlies
the relation between free energy and the partition function.

3 As a check on the laws of arithmetic we verify that indeed $7!!/(4!)^2 = 8!/(4!)^3 2^4$.
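The claim that $W$ contains only connected diagrams can be tested on the baby problem by brute force: build the double series for $Z(J)/Z(0,0)$, take its logarithm, and look at the $J$-dependent coefficients. The sketch below assumes $m = 1$ and truncates at $\lambda^2$ and $J^6$; the data layout (a dictionary keyed by the powers of $\lambda$ and $J$) and all function names are my own choices, not anything from the text.

```python
from fractions import Fraction
from math import factorial

N_MAX, K_MAX = 2, 6  # keep terms up to lambda^2 and J^6

def z_coeff(n, k):
    """Coefficient of lambda^n J^k in Z(J)/Z(0,0), with m = 1."""
    if k % 2:
        return Fraction(0)
    p = (4 * n + k) // 2
    return Fraction((-1) ** n * factorial(4 * n + k),
                    factorial(n) * factorial(4) ** n * factorial(k) * factorial(p) * 2 ** p)

def multiply(a, b):
    """Product of two truncated double series stored as {(n, k): coefficient}."""
    out = {}
    for (n1, k1), c1 in a.items():
        for (n2, k2), c2 in b.items():
            n, k = n1 + n2, k1 + k2
            if n <= N_MAX and k <= K_MAX:
                out[(n, k)] = out.get((n, k), Fraction(0)) + c1 * c2
    return out

def log_series(z):
    """log(1 + x) = x - x^2/2 + x^3/3 - ..., where x = z - 1 has no constant term."""
    x = {key: c for key, c in z.items() if key != (0, 0)}
    result, power = {}, dict(x)
    # x^j starts at order n + k/2 = j, so this many powers suffice for the truncation
    for j in range(1, N_MAX + K_MAX // 2 + 1):
        for key, c in power.items():
            result[key] = result.get(key, Fraction(0)) + Fraction((-1) ** (j + 1), j) * c
        power = multiply(power, x)
    return result

Z = {(n, k): z_coeff(n, k) for n in range(N_MAX + 1) for k in range(0, K_MAX + 1, 2)}
log_Z = log_series(Z)
# The J-dependent part of log Z is W; the J-independent part is log Z(J=0, lambda).
W = {key: c for key, c in log_Z.items() if key[1] > 0 and c != 0}
for key in sorted(W):
    print(f"lambda^{key[0]} J^{key[1]}: {W[key]}")
```

The $\lambda J^4$ entry of `W` is the single connected tree diagram, with the 4! of rule (2) undone by the $1/4!$ in front of $J^4$, while the much larger $\lambda J^4$ coefficient of $Z$ also counts the disconnected pieces. The $\lambda^2 J^6$ entry is likewise one tree diagram divided by its symmetry factor.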


Propagation: from here to there 


All these features of the baby problem are structurally the same as the corresponding 
features of field theory and we can take over the discussion almost immediately. But before 
we graduate to field theory, let us consider what I call a child problem, the evaluation of a 
multiple integral instead of a single integral: 


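$$Z(J) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dq_1\, dq_2 \cdots dq_N\, e^{-\frac{1}{2} q\cdot A\cdot q - (\lambda/4!)q^4 + J\cdot q} \tag{7}$$

with $q^4 \equiv \sum_i q_i^4$. Generalizing the steps leading to (3) we obtain

$$Z(J) = \left[\frac{(2\pi)^N}{\det[A]}\right]^{\frac{1}{2}} e^{-(\lambda/4!)\sum_i (\partial/\partial J_i)^4}\, e^{\frac{1}{2} J\cdot A^{-1}\cdot J} \tag{8}$$

Alternatively, just as in (5) we can expand in powers of $J$

$$Z(J) = \sum_{s=0}^{\infty}\sum_{i_1=1}^{N}\cdots\sum_{i_s=1}^{N} \frac{1}{s!}\, J_{i_1}\cdots J_{i_s} \int \Big(\prod_l dq_l\Big) e^{-\frac{1}{2}q\cdot A\cdot q - (\lambda/4!)q^4}\, q_{i_1}\cdots q_{i_s}$$

$$= Z(0, 0) \sum_{s=0}^{\infty}\sum_{i_1=1}^{N}\cdots\sum_{i_s=1}^{N} \frac{1}{s!}\, J_{i_1}\cdots J_{i_s}\, G^{(s)}_{i_1\cdots i_s} \tag{9}$$

which again we can expand in powers of $\lambda$ and evaluate by Wick contracting.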

The one feature the child problem has that the baby problem doesn’t is propagation
“from here to there”. Recall the discussion of the propagator in chapter I.3. Just as in
(I.3.16) we can think of the index $i$ as labeling the sites on a lattice. Indeed, in (I.3.16) we
had in effect evaluated the “2-point Green’s function” $G^{(2)}_{ij}$ to zeroth order in $\lambda$ (differentiate
(I.3.16) with respect to $J$ twice):

$$G^{(2)}_{ij}(\lambda = 0) = \left[\int \Big(\prod_l dq_l\Big) e^{-\frac{1}{2} q\cdot A\cdot q}\, q_i q_j\right] \Big/ Z(0, 0) = (A^{-1})_{ij}$$

(see also the appendix to chapter I.2). The matrix element $(A^{-1})_{ij}$ describes propagation
from $i$ to $j$. In the baby problem, each term in the expansion of $Z(J)$ can be associated
with several diagrams but that is no longer true with propagation.





Let us now evaluate the “4-point Green’s function” to order $\lambda$:

$$G^{(4)}_{ijkl} = \int \Big(\prod_m dq_m\Big) e^{-\frac{1}{2}q\cdot A\cdot q}\, q_i q_j q_k q_l \Big[1 - \frac{\lambda}{4!}\sum_n q_n^4 + O(\lambda^2)\Big] \Big/ Z(0, 0)$$

$$= (A^{-1})_{ij}(A^{-1})_{kl} + (A^{-1})_{il}(A^{-1})_{jk} + (A^{-1})_{ik}(A^{-1})_{jl} - \lambda \sum_n (A^{-1})_{in}(A^{-1})_{jn}(A^{-1})_{kn}(A^{-1})_{ln} + \cdots + O(\lambda^2) \tag{10}$$

The first three terms describe one excitation propagating from $i$ to $j$ and another propagating
from $k$ to $l$, plus the two possible permutations on this “history.” The order $\lambda$ term
tells us that four excitations, propagating from $i$ to $n$, from $j$ to $n$, from $k$ to $n$, and from $l$
to $n$, meet at $n$ and interact with an amplitude proportional to $\lambda$, where $n$ is anywhere on
the lattice or mattress. By the way, you also see why it is convenient to define the interaction
$(\lambda/4!)\varphi^4$ with a 1/4!: $q_i$ has a choice of four $q_n$’s to contract with, $q_j$ has three $q_n$’s to
contract with, and so on, producing a factor of 4! to cancel the (1/4!).

I intentionally did not display in (10) the $O(\lambda)$ terms produced by Wick contracting some
of the $q_n$’s with each other. There are two types: (I) Contracting a pair of $q_n$’s produces
something like $\lambda(A^{-1})_{ij}(A^{-1})_{kn}(A^{-1})_{ln}(A^{-1})_{nn}$ and (II) contracting the $q_n$’s with each
other produces the first three terms in (10) multiplied by $(A^{-1})_{nn}(A^{-1})_{nn}$. We see that
(I) and (II) correspond to diagrams b and c in figure I.7.1, respectively. Evidently, the two
excitations do not interact with each other. I will come back to (II) later in this chapter.
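To see where the 4! and the two undisplayed types of terms come from, one can simply enumerate every Wick contraction of $q_i q_j q_k q_l q_n q_n q_n q_n$. The following sketch does the counting only (no matrix $A$ is needed); the labels `n1` to `n4` for the four copies of $q_n$ and the function names are mine.

```python
from collections import Counter

def pairings(items):
    """Yield every way of grouping the items into unordered pairs (Wick contractions)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

EXTERNAL = ["i", "j", "k", "l"]
VERTEX_LEGS = ["n1", "n2", "n3", "n4"]   # the four q_n's sitting at the same site n

def classify(pairing):
    """Sort a contraction by how many external q's are tied to the vertex."""
    mixed = sum(1 for a, b in pairing if (a in EXTERNAL) != (b in EXTERNAL))
    return {4: "connected", 2: "type I", 0: "type II"}[mixed]

counts = Counter(classify(p) for p in pairings(EXTERNAL + VERTEX_LEGS))
print(counts)
```

The number of "connected" contractions is what cancels the 1/4! in the interaction, and the three classes together must add up to 7!!, the total number of ways of pairing eight objects.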


Perturbative field theory 


You should now be ready for field theory! 

Indeed, the functional integral in (1) (which I repeat here)

$$Z(J) = \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - (\lambda/4!)\varphi^4 + J\varphi\}} \tag{11}$$

has the same form as the ordinary integral in (2) and the multiple integral in (7). There
is one minor difference: there is no $i$ in (2) and (7), but as I noted in chapter I.2 we can
Wick rotate (11) and get rid of the $i$, but we won’t. The significant difference is that $J$ and
$\varphi$ in (11) are functions of a continuous variable $x$, while $J$ and $q$ in (2) are not functions of
anything and in (7) are functions of a discrete variable. Aside from that, everything goes
through the same way.

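As in (3) and (8) we have

$$Z(J) = e^{-(i/4!)\lambda \int d^4w\, [\delta/i\delta J(w)]^4} \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] + J\varphi\}}$$

$$= Z(0, 0)\, e^{-(i/4!)\lambda \int d^4w\, [\delta/i\delta J(w)]^4}\, e^{-(i/2)\int\!\!\int d^4x\, d^4y\, J(x)D(x - y)J(y)} \tag{12}$$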

The structural similarity is total. 

The role of $1/m^2$ in (3) and of $A^{-1}$ in (8) is now played by the propagator

$$D(x - y) = \int \frac{d^4k}{(2\pi)^4}\, \frac{e^{ik(x - y)}}{k^2 - m^2 + i\varepsilon}$$

Incidentally, if you go back to chapter I.3 you will see that if we were in $d$-dimensional
spacetime, $D(x - y)$ would be given by the same expression with $d^4k/(2\pi)^4$ replaced by
$d^dk/(2\pi)^d$. The ordinary integral (2) is like a field theory in 0-dimensional spacetime: if
we set $d = 0$, there is no propagating around and $D(x - y)$ collapses to $-1/m^2$. You see
that it all makes sense.

We also know that $J(x)$ corresponds to sources and sinks. Thus, if we expand $Z(J)$
as a series in $J$, the powers of $J$ would indicate the number of particles involved in the
process. (Note that in this nomenclature the scattering process $\varphi + \varphi \to \varphi + \varphi$ counts as a
4-particle process: we count the total number of incoming and outgoing particles.) Thus,
in particle physics it often makes sense to specify the power of $J$. Exactly as in the baby
and child problems, we can expand in $J$ first:

$$Z(J) = Z(0, 0) \sum_{s=0}^{\infty} \frac{i^s}{s!} \int dx_1 \cdots dx_s\, J(x_1) \cdots J(x_s)\, G^{(s)}(x_1, \cdots, x_s)$$

$$= \sum_{s=0}^{\infty} \frac{i^s}{s!} \int dx_1 \cdots dx_s\, J(x_1) \cdots J(x_s) \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - (\lambda/4!)\varphi^4\}}\, \varphi(x_1) \cdots \varphi(x_s) \tag{13}$$

In particular, we have the 2-point Green’s function

$$G(x_1, x_2) \equiv \frac{1}{Z(0, 0)} \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - (\lambda/4!)\varphi^4\}}\, \varphi(x_1)\varphi(x_2) \tag{14}$$

the 4-point Green’s function,

$$G(x_1, x_2, x_3, x_4) \equiv \frac{1}{Z(0, 0)} \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - (\lambda/4!)\varphi^4\}}\, \varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4) \tag{15}$$

and so on. [Sometimes $Z(J)$ is called the generating functional as it generates the Green’s
functions.] Obviously, by translation invariance, $G(x_1, x_2)$ does not depend on $x_1$ and $x_2$
separately, but only on $x_1 - x_2$. Similarly, $G(x_1, x_2, x_3, x_4)$ only depends on $x_1 - x_4$, $x_2 - x_4$,
and $x_3 - x_4$. For $\lambda = 0$, $G(x_1, x_2)$ reduces to $iD(x_1 - x_2)$, the propagator introduced in
chapter I.3. While $D(x_1 - x_2)$ describes the propagation of a particle between $x_1$ and $x_2$ in
the absence of interaction, $G(x_1 - x_2)$ describes the propagation of a particle between $x_1$
and $x_2$ in the presence of interaction. If you understood our discussion of $G^{(4)}_{ijkl}$ you would
know that $G(x_1, x_2, x_3, x_4)$ describes the scattering of particles.

In some sense, there are two ways of doing field theory, what I might call the Schwinger 
way (12) or the Wick way (13). 

Thus, to summarize, Feynman diagrams are just an extremely convenient way of representing
the terms in a double series expansion of $Z(J)$ in $\lambda$ and $J$.

As I said in the preface, I have no intention of turning you into a whiz at calculating
Feynman diagrams. In any case, that can only come with practice. Instead, I tried to give
you as clear an account as I can muster of the concept behind this marvellous invention of
Feynman’s, which as Schwinger noted rather bitterly, enables almost anybody to become
a field theorist. For the moment, don’t worry too much about factors of 4! and 2!

Figure I.7.6

Collision between particles 

As I already mentioned, I described in chapter I.4 the strategy of setting up sources and
sinks to watch the propagation of a particle (which I will call a meson) associated with
the field $\varphi$. Let us now set up two sources and two sinks to watch two mesons scatter off
each other. The setup is shown in figure I.7.6. The sources localized in regions 1 and 2
both produce a meson, and the two mesons eventually disappear into the sinks localized
in regions 3 and 4. It clearly suffices to find in $Z$ a term containing $J(x_1)J(x_2)J(x_3)J(x_4)$.
But this is just $G(x_1, x_2, x_3, x_4)$.

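Let us be content with first order in $\lambda$. Going the Wick way we have to evaluate

$$\frac{1}{Z(0, 0)} \left(-\frac{i\lambda}{4!}\right) \int d^4w \int D\varphi\, e^{i\int d^4x \{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2]\}}\, \varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)\varphi(w)^4 \tag{16}$$

Just as in (10) we Wick contract and obtain

$$(-i\lambda) \int d^4w\, D(x_1 - w)D(x_2 - w)D(x_3 - w)D(x_4 - w) \tag{17}$$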

As a check, let us also derive this the Schwinger way. Replace $e^{-(i/4!)\lambda \int d^4w\, (\delta/i\delta J(w))^4}$ by
$-(i/4!)\lambda \int d^4w\, (\delta/\delta J(w))^4$ and $e^{-(i/2)\int\!\!\int d^4x\, d^4y\, J(x)D(x - y)J(y)}$ by

$$\frac{i^4}{4!\,2^4} \left[\int\!\!\int d^4x\, d^4y\, J(x)D(x - y)J(y)\right]^4$$

Figure I.7.7

To save writing, it would be sagacious to introduce the abbreviations $J_a$ for $J(x_a)$, $\int_a$ for
$\int d^4x_a$, and $D_{ab}$ for $D(x_a - x_b)$. Dropping overall numerical factors, which I invite you to
fill in, we obtain

$$\sim i\lambda \int_w \left(\frac{\delta}{\delta J_w}\right)^4 \int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int\!\!\int D_{ae}D_{bf}D_{cg}D_{dh}\, J_aJ_bJ_cJ_dJ_eJ_fJ_gJ_h \tag{18}$$

The four $(\delta/\delta J_w)$’s hit the eight $J$’s in all possible combinations producing many terms,
which again I invite you to write out. Two of the three terms are disconnected. The
connected term is


$$\sim i\lambda \int_w \int\!\!\int\!\!\int\!\!\int D_{aw}D_{bw}D_{cw}D_{dw}\, J_aJ_bJ_cJ_d \tag{19}$$

Evidently, this comes from the four $(\delta/\delta J_w)$’s hitting $J_e$, $J_f$, $J_g$, and $J_h$, thus setting
$x_e$, $x_f$, $x_g$, and $x_h$ to $w$. Compare (19) with $[8!(-\lambda)/(4!)^3(2m^2)^4]J^4$ in the baby problem.

Recall that we embarked on this calculation in order to produce two mesons by sources
localized in regions 1 and 2, watch them scatter off each other, and then get rid of them
with sinks localized in regions 3 and 4. In other words, we set the source function $J(x)$
equal to a set of delta functions peaked at $x_1$, $x_2$, $x_3$, and $x_4$. This can be immediately
read off from (19): the scattering amplitude is just $-i\lambda \int_w D_{1w}D_{2w}D_{3w}D_{4w}$, exactly as
in (17).

The result is very easy to understand (see figure I.7.7a). Two mesons propagate from
their birthplaces at $x_1$ and $x_2$ to the spacetime point $w$, with amplitude $D(x_1 - w)D(x_2 - w)$,
scatter with amplitude $-i\lambda$, and then propagate from $w$ to their graves at $x_3$ and $x_4$,
with amplitude $D(w - x_3)D(w - x_4)$ [note that $D(x) = D(-x)$]. The integration over $w$
just says that the interaction point $w$ could have been anywhere. Everything is as in the
child problem.

It is really pretty simple once you get it. Still confused? It might help if you think of (12)
as some kind of machine $e^{-(i/4!)\lambda \int d^4w\, [\delta/i\delta J(w)]^4}$ operating on

$$Z(J, \lambda = 0) = e^{-(i/2)\int\!\!\int d^4x\, d^4y\, J(x)D(x - y)J(y)}$$

When expanded out, $Z(J, \lambda = 0)$ is a bunch of $J$’s thrown here and there in spacetime,
with pairs of $J$’s connected by $D$’s. Think of a bunch of strings with the string ends
corresponding to the $J$’s. What does the “machine” do? The machine is a sum of terms,
for example, the term

$$\sim \lambda^2 \int d^4w_1 \int d^4w_2 \left[\frac{\delta}{\delta J(w_1)}\right]^4 \left[\frac{\delta}{\delta J(w_2)}\right]^4$$

When this term operates on a term in $Z(J, \lambda = 0)$ it grabs four string ends and glues them
together at the point $w_2$, then it grabs another four string ends and glues them together at
the point $w_1$. The locations $w_1$ and $w_2$ are then integrated over. Two examples are shown
in figure I.7.8. It is a game you can play with a child! This childish game of gluing four
string ends together generates all the Feynman diagrams of our scalar field theory.

Figure I.7.8


Do it once and for all 

Now Feynman comes along and says that it is ridiculous to go through this long-winded
yakkety-yak every time. Just figure out the rules once and for all.

For example, for the diagram in figure I.7.7a associate the factor $-i\lambda$ with the scattering,
the factor $D(x_1 - w)$ with the propagation from $x_1$ to $w$, and so forth, conceptually exactly
the same as in our baby problem. See, you could have invented Feynman diagrams. (Well,
not quite. Maybe not, maybe yes.)

Just as in going from (I.4.1) to (I.4.2), it is easier to pass to momentum space. Indeed,
that is how experiments are actually done. A meson with momentum $k_1$ and a meson with
momentum $k_2$ collide and scatter, emerging with momenta $k_3$ and $k_4$ (see figure I.7.7b).
Each spacetime propagator gives us

$$D(x_a - w) = \int \frac{d^4k_a}{(2\pi)^4}\, \frac{e^{\pm ik_a(x_a - w)}}{k_a^2 - m^2 + i\varepsilon}$$

Note that we have the freedom of associating with the dummy integration variable either
a plus or a minus sign in the exponential. Thus in integrating over $w$ in (17) we obtain

$$\int d^4w\, e^{-i(k_1 + k_2 - k_3 - k_4)w} = (2\pi)^4 \delta^{(4)}(k_1 + k_2 - k_3 - k_4).$$

That the interaction could occur anywhere in spacetime translates into momentum conservation
$k_1 + k_2 = k_3 + k_4$. (We put in the appropriate minus signs in two of the $D$’s so
that we can think of $k_3$ and $k_4$ as outgoing momenta.)

So there are Feynman diagrams in real spacetime and in momentum space. Spacetime
Feynman diagrams are literally pictures of what happened. (A trivial remark: the orientation
of Feynman diagrams is a matter of idiosyncratic choice. Some people draw them
with time running vertically, others with time running horizontally. We follow Feynman
in this text.)


The rules 

We have thus derived the celebrated momentum space Feynman rules for our scalar field 
theory: 

1. Draw a Feynman diagram of the process (fig. I.7.7b for the example we just discussed).

2. Label each line with a momentum $k$ and associate with it the propagator $i/(k^2 - m^2 + i\varepsilon)$.

3. Associate with each interaction vertex the coupling $-i\lambda$ and $(2\pi)^4 \delta^{(4)}(\sum_i k_i - \sum_j k_j)$,
forcing the sum of momenta $\sum_i k_i$ coming into the vertex to be equal to the sum of momenta
$\sum_j k_j$ going out of the vertex.

4. Momenta associated with internal lines are to be integrated over with the measure $\frac{d^4k}{(2\pi)^4}$.
Incidentally, this corresponds to the summation over intermediate states in ordinary perturbation
theory.

5. Finally, there is a rule about some really pesky symmetry factors. They are the analogs of
those numerical factors in our baby problem. As a result, some diagrams are to be multiplied
by a symmetry factor such as $\frac{1}{2}$. These originate from various combinatorial factors counting
the different ways in which the $(\delta/\delta J)$’s can hit the $J$’s in expressions such as (18). I will let
you discover a symmetry factor in exercise I.7.2.

We will illustrate by examples what these rules (and the concept of internal lines) mean.

Our first example is just the diagram (fig. I.7.7b) that we have calculated. Applying the
rules we obtain

$$-i\lambda (2\pi)^4 \delta^{(4)}(k_1 + k_2 - k_3 - k_4) \prod_{a=1}^{4} \left(\frac{i}{k_a^2 - m^2 + i\varepsilon}\right)$$

You would agree that it is silly to drag the factor $\prod_{a=1}^{4} \left(\frac{i}{k_a^2 - m^2 + i\varepsilon}\right)$ around, since it would
be common to all diagrams in which two mesons scatter into two mesons. So we append
to the Feynman rules an additional rule that we do not associate a propagator with external
lines. (This is known in the trade as “amputating the external legs”.)

In an actual scattering experiment, the external particles are of course real and on shell,
that is, their momenta satisfy $k^2 - m^2 = 0$. Thus we better not keep the propagators of the
external lines around. Arithmetically, this amounts to multiplying the Green’s functions
[and what we have calculated thus far are indeed Green’s functions, see (16)] by the factor
$\prod_a (-i)(k_a^2 - m^2)$ and then setting $k_a^2 = m^2$ (“putting the particles on shell” in conversation). At
this point, this procedure sounds like formal overkill. We will come back to the rationale
behind it at the end of the next chapter.

Also, since there is always an overall factor for momentum conservation we should not 
drag the overall delta function around either. Thus, we have two more rules: 

6. Do not associate a propagator with external lines. 

7. A delta function for overall momentum conservation is understood. 

Applying these rules we obtain an amplitude which we will denote by $\mathcal{M}$. For example,
for the diagram in figure I.7.7b $\mathcal{M} = -i\lambda$.


The birth of particles 


As explained in chapter I.1 one of the motivations for constructing quantum field theory
was to describe the production of particles. We are now ready to describe how two colliding
mesons can produce four mesons. The Feynman diagram in figure I.7.9 (compare
fig. I.7.3a) can occur in order $\lambda^2$ in perturbation theory. Amputating the external legs, we
drop the factor $\prod_{a=1}^{6} [i/(k_a^2 - m^2 + i\varepsilon)]$ associated with the six external lines, keeping only
the propagator associated with the one internal line. For each vertex we put in a factor of
$(-i\lambda)$ and a momentum conservation delta function (rule 3). Then we integrate over the
internal momentum $q$ (rule 4) to obtain

$$(-i\lambda)^2 \int \frac{d^4q}{(2\pi)^4}\, \frac{i}{q^2 - m^2 + i\varepsilon}\, (2\pi)^4 \delta^{(4)}(k_1 + k_2 - k_3 - q)\, (2\pi)^4 \delta^{(4)}[q - (k_4 + k_5 + k_6)] \tag{20}$$

The integral over $q$ is a cinch, giving

$$(-i\lambda)^2\, \frac{i}{(k_4 + k_5 + k_6)^2 - m^2 + i\varepsilon}\, (2\pi)^4 \delta^{(4)}[k_1 + k_2 - (k_3 + k_4 + k_5 + k_6)] \tag{21}$$

We have already agreed (rule 7) not to drag the overall delta function around. This example
teaches us that we didn’t have to write down the delta functions and then annihilate (all
but one of) them by integrating. In figure I.7.9 we could have simply imposed momentum
conservation from the beginning and labeled the internal line as $k_4 + k_5 + k_6$ instead of $q$.

With some practice you could just write down the amplitude

$$\mathcal{M} = (-i\lambda)^2\, \frac{i}{(k_4 + k_5 + k_6)^2 - m^2 + i\varepsilon} \tag{22}$$

directly: just remember, a coupling $(-i\lambda)$ for each vertex and a propagator for each internal
(but not external) line. Pretty easy once you get the hang of it. As Schwinger said, the masses
could do it.


The cost of not being real 

The physics involved is also quite clear: The internal line is associated with a virtual particle 
whose relativistic 4-momentum $k_4 + k_5 + k_6$ squared is not necessarily equal to $m^2$, as it
would have to be if the particle were real. The farther the momentum of the virtual particle 
is from the mass shell the smaller the amplitude. You are penalized for not being real. 
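A few lines of Python make the penalty concrete. This is a sketch under my own assumptions: the metric signature (+, -, -, -), the function names, the numerical value of the $i\varepsilon$ regulator, and the sample momenta are all chosen for illustration; the formula evaluated is the amplitude (22).

```python
import math

def minkowski_square(p):
    """p^2 = E^2 - |p|^2 for a 4-momentum (E, px, py, pz)."""
    energy, px, py, pz = p
    return energy ** 2 - px ** 2 - py ** 2 - pz ** 2

def add_momenta(*momenta):
    """Componentwise sum of 4-momenta."""
    return tuple(sum(components) for components in zip(*momenta))

def tree_amplitude(k4, k5, k6, coupling, mass, epsilon=1e-9):
    """Amplitude (22): (-i lambda)^2 * i / ((k4 + k5 + k6)^2 - m^2 + i epsilon)."""
    q = add_momenta(k4, k5, k6)
    propagator = 1j / (minkowski_square(q) - mass ** 2 + 1j * epsilon)
    return (-1j * coupling) ** 2 * propagator

def back_to_back(momentum, mass=1.0):
    """Two mesons with opposite spatial momenta plus one at rest, all on shell."""
    energy = math.sqrt(mass ** 2 + momentum ** 2)
    return (energy, momentum, 0.0, 0.0), (energy, -momentum, 0.0, 0.0), (mass, 0.0, 0.0, 0.0)

near = abs(tree_amplitude(*back_to_back(0.0), coupling=1.0, mass=1.0))
far = abs(tree_amplitude(*back_to_back(math.sqrt(3.0)), coupling=1.0, mass=1.0))
print(near, far)
```

In the first configuration the virtual particle has $q^2 - m^2 = 8m^2$, in the second $24m^2$, so the second amplitude is three times smaller in magnitude.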

According to the quantum rules for dealing with identical particles, to obtain the full 
amplitude we have to symmetrize among the four final momenta. One way of saying it is 
to note that the line labeled $k_3$ in figure I.7.9 could have been labeled $k_4$, $k_5$, or $k_6$, and
we have to add all four contributions. 

To make sure that you understand the Feynman rules I insist that you go through the 
path integral calculation to obtain (21) starting with (12) and (13). 

I am repeating myself but I think it is worth emphasizing again that there is nothing
particularly magical about Feynman diagrams.

Loops and a first look at divergence


Just as in our baby problem, we have tree diagrams and loop diagrams. So far we have only
looked at tree diagrams. Our next example is the loop diagram in figure I.7.10 (compare
fig. I.7.2a). Applying the Feynman rules, we obtain

$$\frac{1}{2}(-i\lambda)^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{i}{k^2 - m^2 + i\varepsilon}\, \frac{i}{(k_1 + k_2 - k)^2 - m^2 + i\varepsilon} \tag{23}$$

As above, the physics embodied in (23) is clear: As $k$ ranges over all possible values, the
integrand is large only if one or the other or both of the virtual particles associated with
the two internal lines are close to being real. Once again, there is a penalty for not being
real (see exercise I.7.4).

For large $k$ the integrand goes as $1/k^4$. The integral is infinite! It diverges as $\int d^4k\, (1/k^4)$.
We will come back to this apparent disaster in chapter III.1.
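To get a feeling for how mild this divergence is, one can cut the integral off at some large momentum and watch it grow. The sketch below is a simplification of my own, not the calculation done later in the book: it uses a Euclidean (Wick-rotated) stand-in for (23) with the external momentum set to zero, drops the angular factor, and keeps only the radial integral $\int_0^{\Lambda} k^3\, dk/(k^2 + m^2)^2$.

```python
import math

def radial_integral(cutoff, mass=1.0):
    """Closed form of the integral of k^3 / (k^2 + m^2)^2 from 0 to the cutoff."""
    x = (cutoff / mass) ** 2
    return 0.5 * (math.log(1.0 + x) + 1.0 / (1.0 + x) - 1.0)

def radial_integral_numeric(cutoff, mass=1.0, steps=200000):
    """Midpoint-rule evaluation of the same integral, as an independent check."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k ** 3 / (k ** 2 + mass ** 2) ** 2
    return total * h

for cutoff in (10.0, 100.0, 1000.0):
    print(cutoff, radial_integral(cutoff))
```

Each factor of 10 in the cutoff adds roughly $\ln 10$ to the result: the integral grows without bound, but only logarithmically.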

With some practice, you will be able to write down the amplitude by inspection. As
another example, consider the three-loop diagram in figure I.7.11 contributing in $O(\lambda^4)$ to
meson-meson scattering. First, for each loop pick an internal line and label the momentum
it carries, $p$, $q$, and $r$ in our example. There is considerable freedom of choice in labeling;
your choice may well not agree with mine, but of course the physics should not depend
on it. The momenta carried by the other internal lines are then fixed by momentum
conservation, as indicated in the figure. Write down a coupling for each vertex, and a
propagator for each internal line, and integrate over the internal momenta $p$, $q$, and $r$.
Thus, without worrying about symmetry factors, we have the amplitude

$$(-i\lambda)^4 \int \frac{d^4p}{(2\pi)^4} \frac{d^4q}{(2\pi)^4} \frac{d^4r}{(2\pi)^4}\, \frac{i}{p^2 - m^2 + i\varepsilon}\, \frac{i}{(k_1 + k_2 - p)^2 - m^2 + i\varepsilon}$$

$$\times\, \frac{i}{q^2 - m^2 + i\varepsilon}\, \frac{i}{(p - q - r)^2 - m^2 + i\varepsilon}\, \frac{i}{r^2 - m^2 + i\varepsilon}\, \frac{i}{(k_1 + k_2 - r)^2 - m^2 + i\varepsilon} \tag{24}$$

Again, this triple integral also diverges: It goes as $\int d^{12}P\, (1/P^{12})$.
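The estimates $\int d^4k\,(1/k^4)$ and $\int d^{12}P\,(1/P^{12})$ come from the same crude counting: each loop integration contributes four powers of momentum upstairs and each internal propagator two powers downstairs. A tiny sketch of that counting follows; the function name is mine, and the result is only a rough guide to the large-momentum behavior, not a proof of convergence or divergence.

```python
def momentum_power_count(loops, internal_lines, spacetime_dim=4):
    """Net power of momentum at large momenta: d per loop, minus 2 per propagator.

    Zero signals logarithmic growth with a cutoff, positive a power-law
    growth, negative a large-momentum region that looks harmless.
    """
    return spacetime_dim * loops - 2 * internal_lines

print("one loop, two propagators, as in (23):", momentum_power_count(1, 2))
print("three loops, six propagators, as in (24):", momentum_power_count(3, 6))
```

Both diagrams give zero net power, which is the statement that the integrals in (23) and (24) diverge logarithmically.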


An assurance 

When I teach quantum field theory, at this point in the course some students get unaccountably
very anxious about Feynman diagrams. I would like to assure the reader that
the whole business is really quite simple. Feynman diagrams can be thought of simply as
pictures in spacetime of the antics of particles, coming together, colliding and producing
other particles, and so on. One student was puzzled that the particles do not move in
straight lines. Remember that a quantum particle propagates like a wave; $D(x - y)$ gives
us the amplitude for the particle to propagate from $x$ to $y$. Evidently, it is more convenient to
think of particles in momentum space: Fourier told us so. We will see many more examples
of Feynman diagrams, and you will soon be well acquainted with them. Another student
was concerned about evaluating the integrals in (23) and (24). I haven’t taught you how yet,
but will eventually. The good news is that in contemporary research on the frontline few
theoretical physicists actually have to calculate Feynman diagrams getting all the factors
of 2 right. In most cases, understanding the general behavior of the integral is sufficient.
But of course, you should always take pride in getting everything right. In chapters II.6,
III.6, and III.7 I will calculate Feynman diagrams for you in detail, getting all the factors
right so as to be able to compare with experiments.

Vacuum fluctuations 

Let us go back to the terms I neglected in (18) and which you are supposed to have
figured out. For example, the four $[\delta/\delta J(w)]$’s could have hit $J_c$, $J_d$, $J_g$, and $J_h$ in (18)
thus producing something like

$$\sim i\lambda \int\!\!\int\!\!\int\!\!\int D_{ae}D_{bf}\, J_aJ_bJ_eJ_f \left(\int_w D_{ww}D_{ww}\right).$$

The coefficient of $J(x_1)J(x_2)J(x_3)J(x_4)$ is then $D_{13}D_{24}(-i\lambda \int_w D_{ww}D_{ww})$ plus terms
obtained by permuting.

The corresponding physical process is easy to describe in words and in pictures (see
figure I.7.12). The source at $x_1$ produces a particle that propagates freely without any
interaction to $x_3$, where it comes to a peaceful death. The particle produced at $x_2$ leads
a similarly uneventful life before being absorbed at $x_4$. The two particles did not interact at
all. Somewhere off at the point $w$, which could perfectly well be anywhere in the universe,
there was an interaction with amplitude $-i\lambda$. This is known as a vacuum fluctuation:
As explained in chapter I.1, quantum mechanics and special relativity inevitably cause
particles to pop out of the vacuum, and they could even interact before vanishing again
into the vacuum. Look at different time slices (one of which is indicated by the dotted line)
in figure I.7.12. In the far past, the universe has no particles. Then it has two particles,
then four, then two again, and finally in the far future, none. We will have a lot more to
say in chapter VIII.2 about these fluctuations. Note that vacuum fluctuations occur also in
our baby and child problems (see, e.g., figures I.7.1c, I.7.2g,h,i,j, and so forth).

Figure I.7.12


Two words about history 

I believe strongly that any self-respecting physicist should learn about the history of physics,
and the history of quantum field theory is among the most fascinating. Unfortunately,
I do not have the space to go into it here.⁴ The path integral approach to field theory using
sources $J(x)$ outlined here is mainly associated with Julian Schwinger, who referred to it
as “sorcery” during my graduate student days (so that I could tell people who inquired that
I was studying sorcery in graduate school). In one of the many myths retold around tribal
fires by physicists, Richard Feynman came upon his rules in a blinding flash of insight.
In 1949 Freeman Dyson showed that the Feynman rules which so mystified people at the
Pocono conference a year earlier could actually be obtained from the more formal work of
Julian Schwinger and of Shin-Itiro Tomonaga.


Exercises 

I.7.1 Work out the amplitude corresponding to figure I.7.11 in (24).

I.7.2 Derive (23) from first principles, that is from (11). It is a bit tedious, but straightforward. You should find
a symmetry factor $\frac{1}{2}$.

I.7.3 Draw all the diagrams describing two mesons producing four mesons up to and including order $\lambda^2$.
Write down the corresponding Feynman amplitudes.

I.7.4 By Lorentz invariance we can always take $k_1 + k_2 = (E, \vec{0})$ in (23). The integral can be studied as a function
of $E$. Show that for both internal particles to become real $E$ must be greater than $2m$. Interpret physically
what is happening.


4 An excellent sketch of the history of quantum field theory is given in chapter 1 of The Quantum Theory of 
Fields by S. Weinberg. For a fascinating history of Feynman diagrams, see Drawing Theories Apart by D. Kaiser. 




Quantizing Canonically 


Quantum electrodynamics is made to appear more difficult 
than it actually is by the very many equivalent methods by 
which it may be formulated. 

—R. P. Feynman 

Always create before we annihilate, not the other way 
around. 

—Anonymous 


Complementary formalisms 

I adopted the path integral formalism as the quickest way to quantum field theory. But I 
must also discuss the canonical formalism, not only because historically it was the method 
used to develop quantum field theory, but also because of its continuing importance. 
Interestingly, the canonical and the path integral formalisms often appear complementary, 
in the sense that results difficult to see in one are clear in the other. 


Heisenberg and Dirac 

Let us begin with a lightning review of Heisenberg’s approach to quantum mechanics. 
Given a classical Lagrangian for a single particle $L = \frac{1}{2}\dot{q}^2 - V(q)$ (we set the mass equal to
1), the canonical momentum is defined as $p \equiv \delta L/\delta\dot{q} = \dot{q}$. The Hamiltonian is then given
by $H = p\dot{q} - L = p^2/2 + V(q)$. Heisenberg promoted position $q(t)$ and momentum $p(t)$
to operators by imposing the canonical commutation relation

$$[p, q] = -i \tag{1}$$

Operators evolve in time according to

$$\frac{dp}{dt} = i[H, p] = -V'(q) \tag{2}$$

and

$$\frac{dq}{dt} = i[H, q] = p \tag{3}$$

In other words, operators constructed out of $p$ and $q$ evolve according to $O(t) =
e^{iHt} O(0) e^{-iHt}$. In (1) $p$ and $q$ are understood to be taken at the same time. We obtain
the operator equation of motion $\ddot{q} = -V'(q)$ by combining (2) and (3).

Following Dirac, we invite ourselves to consider at some instant in time the operator
$a \equiv (1/\sqrt{2\omega})(\omega q + ip)$ with some parameter $\omega$. From (1) we have

$$[a, a^\dagger] = 1 \tag{4}$$

The operator $a(t)$ evolves according to

$$\frac{da}{dt} = i\left[H, \frac{1}{\sqrt{2\omega}}(\omega q + ip)\right] = -i\sqrt{\frac{\omega}{2}}\left(ip + \frac{1}{\omega}V'(q)\right)$$

which can be written in terms of $a$ and $a^\dagger$. The ground state $|0\rangle$ is defined as the state such
that $a|0\rangle = 0$.

In the special case $V'(q) = \omega^2 q$ we get the particularly simple result $\frac{da}{dt} = -i\omega a$. This is
of course the harmonic oscillator $L = \frac{1}{2}\dot{q}^2 - \frac{1}{2}\omega^2 q^2$ and $H = \frac{1}{2}(p^2 + \omega^2 q^2) = \omega(a^\dagger a + \frac{1}{2})$.
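These operator statements can be checked with small matrices. The sketch below is an illustration under assumptions of mine: it represents $a$ in the number basis truncated to a finite size (so the relations hold only away from the last row and column), uses plain Python lists rather than a linear algebra library, and picks an arbitrary value for $\omega$.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, ca=1, cb=1):
    """Entrywise linear combination ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def dagger(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def lowering_operator(size):
    """Truncated annihilation operator: a|n> = sqrt(n)|n-1>."""
    a = [[0.0] * size for _ in range(size)]
    for n in range(1, size):
        a[n - 1][n] = math.sqrt(n)
    return a

SIZE, OMEGA = 8, 1.7   # truncation size and frequency, both arbitrary choices
a = lowering_operator(SIZE)
a_dag = dagger(a)

# Invert a = (omega q + i p) / sqrt(2 omega) and its conjugate for q and p
q = combine(a, a_dag, 1 / math.sqrt(2 * OMEGA), 1 / math.sqrt(2 * OMEGA))
p = combine(a, a_dag, -1j * math.sqrt(OMEGA / 2), 1j * math.sqrt(OMEGA / 2))

commutator_pq = combine(matmul(p, q), matmul(q, p), 1, -1)            # should be -i
hamiltonian = combine(matmul(p, p), matmul(q, q), 0.5, 0.5 * OMEGA ** 2)

for n in range(3):
    print(n, commutator_pq[n][n], hamiltonian[n][n])
```

Away from the truncation edge the diagonal of `commutator_pq` reproduces (1) and the diagonal of `hamiltonian` reproduces $\omega(n + \frac{1}{2})$. The last diagonal entry is spoiled by the truncation, as it must be, since no finite matrices can satisfy $[p, q] = -i$ exactly.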
The generalization to many particles is immediate. Starting with

$$L = \sum_a \frac{1}{2}\dot{q}_a^2 - V(q_1, \cdots, q_N)$$

we have $p_a = \delta L/\delta\dot{q}_a$ and

$$[p_a(t), q_b(t)] = -i\delta_{ab} \tag{5}$$
The generalization to field theory is almost as immediate. In fact, we just use our handy-dandy
substitution table (I.3.6) and see that in $D$-dimensional space $L$ generalizes to

$$L = \int d^Dx \left\{\frac{1}{2}\left(\dot{\varphi}^2 - (\vec{\nabla}\varphi)^2 - m^2\varphi^2\right) - u(\varphi)\right\} \tag{6}$$

where we denote the anharmonic term (the interaction term in quantum field theory) by
$u(\varphi)$. The canonical momentum density conjugate to the field $\varphi(\vec{x}, t)$ is then

$$\pi(\vec{x}, t) = \frac{\delta L}{\delta\dot{\varphi}(\vec{x}, t)} = \partial_0\varphi(\vec{x}, t) \tag{7}$$

so that the canonical commutation relation at equal times now reads [using (I.3.6)]

$$[\pi(\vec{x}, t), \varphi(\vec{x}', t)] = [\partial_0\varphi(\vec{x}, t), \varphi(\vec{x}', t)] = -i\delta^{(D)}(\vec{x} - \vec{x}') \tag{8}$$

(and of course also $[\pi(\vec{x}, t), \pi(\vec{x}', t)] = 0$ and $[\varphi(\vec{x}, t), \varphi(\vec{x}', t)] = 0$.) Note that $\delta_{ab}$ in (5)
gets promoted to $\delta^{(D)}(\vec{x} - \vec{x}')$ in (8). You should check that (8) has the correct dimension.
Turning the canonical crank we find the Hamiltonian 


H = J d D x[n(x, t)d 0 <p(x, t ) — £] 

= J d D x[j[n 2 + (V^o ) 2 + m 2 (p 2 ] + u((p)} 


( 9 ) 
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For the case u = 0, corresponding to the harmonic oscillator, we have a free scalar field 
theory and can go considerably farther. The field equation reads 

(3 2 + m 2 )(p = 0 (10) 


Fourier expanding, we have 

<p{x,t)= [ , (h^i-k-xh (H) 

J y (2rc) D 2co k 

with $\omega_k = +\sqrt{\vec k^2 + m^2}$ so that the field equation (10) is satisfied. The peculiar looking factor
$(2\omega_k)^{-1/2}$ is chosen so that the canonical relation

$$[a(\vec k), a^\dagger(\vec k')] = \delta^{(D)}(\vec k - \vec k') \tag{12}$$

appropriate for creation and annihilation operators implies the canonical commutation
$[\partial_0\varphi(\vec x, t), \varphi(\vec x', t)] = -i\delta^{(D)}(\vec x - \vec x')$ in (8). You should check this but you can see why the
factor $(2\omega_k)^{-1/2}$ is needed since in $\partial_0\varphi$ a factor $\omega_k$ is brought down from the exponential.

As in quantum mechanics, the vacuum or ground state $|0\rangle$ is defined by the condition
$a(\vec k)|0\rangle = 0$ for all $\vec k$ and the single particle state by $|\vec k\rangle = a^\dagger(\vec k)|0\rangle$. Thus, for example,
using (12) we have $\langle 0|\varphi(\vec x, t)|\vec k\rangle = (1/\sqrt{(2\pi)^D 2\omega_k})\, e^{-i(\omega_k t - \vec k\cdot\vec x)}$, which we could think of
as the relativistic wave function of a single particle with momentum $\vec k$. For later use, we
will write this more compactly as $(1/\rho(k))e^{-ik\cdot x}$, with $\rho(k) \equiv \sqrt{(2\pi)^D 2\omega_k}$ a normalization
factor and $k^0 = \omega_k$.

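To make contact with the path integral formalism let us calculate $\langle 0|\varphi(\vec x, t)\varphi(\vec 0, 0)|0\rangle$
for $t > 0$. Of the four terms $a^\dagger a^\dagger$, $a^\dagger a$, $a a^\dagger$, and $aa$ in the product of the two fields
only $aa^\dagger$ survives, and thus using (12) we obtain $\int [d^Dk/(2\pi)^D 2\omega_k]\, e^{-i(\omega_k t - \vec k\cdot\vec x)}$. In other
words, if we define the time-ordered product $T[\varphi(x)\varphi(y)] = \theta(x^0 - y^0)\varphi(x)\varphi(y) + \theta(y^0 - x^0)\varphi(y)\varphi(x)$, we find

$$\langle 0|T[\varphi(\vec x, t)\varphi(\vec 0, 0)]|0\rangle = \int \frac{d^Dk}{(2\pi)^D 2\omega_k} \left[ \theta(t)e^{-i(\omega_k t - \vec k\cdot\vec x)} + \theta(-t)e^{+i(\omega_k t - \vec k\cdot\vec x)} \right] \tag{13}$$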


Go back to (I.3.23). We discover that $\langle 0|T[\varphi(x)\varphi(0)]|0\rangle = iD(x)$, the propagator for a
particle to go from 0 to $x$ we obtained using the path integral formalism. This further
justifies the $i\varepsilon$ prescription in (I.3.22). The physical meaning is that we always create
before we annihilate, not the other way around. This is a form of causality as formulated
in quantum field theory.

A remark: The combination $d^Dk/(2\omega_k)$, even though it does not look Lorentz invariant,
is in fact a Lorentz invariant measure. To see this, we use (I.2.13) to show that (exercise I.8.1)

$$\int d^{(D+1)}k\, \delta(k^2 - m^2)\theta(k^0) f(k^0, \vec k) = \int \frac{d^Dk}{2\omega_k} f(\omega_k, \vec k) \tag{14}$$

for any function $f(k)$. Since Lorentz transformations cannot change the sign of $k^0$, the step
function $\theta(k^0)$ is Lorentz invariant. Thus the left-hand side is manifestly Lorentz invariant,
and hence the right-hand side must also be Lorentz invariant. This shows that relations
such as (13) are Lorentz invariant; they are frame-independent statements.
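
The invariance of $d^Dk/(2\omega_k)$ can also be checked numerically. The following is a minimal sketch, assuming $D = 1$ and a boost along the single spatial axis; the function names (`omega`, `boost`, `jacobian`, `measure_ratio`) and the sample values of the mass and boost velocity are illustrative choices, not taken from the text. It estimates the Jacobian $dk'/dk$ by a finite difference and compares $dk'/(2\omega_{k'})$ with $dk/(2\omega_k)$.

```python
import math

def omega(k, m):
    """On-shell energy sqrt(k^2 + m^2) for momentum k and mass m."""
    return math.sqrt(k * k + m * m)

def boost(k, m, beta):
    """Momentum of the same on-shell particle after a boost with velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (k + beta * omega(k, m))

def jacobian(k, m, beta, h=1e-6):
    """Central-difference estimate of dk'/dk."""
    return (boost(k + h, m, beta) - boost(k - h, m, beta)) / (2.0 * h)

def measure_ratio(k, m, beta):
    """Ratio [dk'/(2 omega')] / [dk/(2 omega)]; it should be 1 if the measure is invariant."""
    k_prime = boost(k, m, beta)
    return jacobian(k, m, beta) * omega(k, m) / omega(k_prime, m)

for k in (-2.0, 0.0, 0.5, 3.0):
    print(k, measure_ratio(k, m=1.0, beta=0.6))
```

Each printed ratio should agree with 1 up to the finite-difference error, reflecting the fact that $dk'/dk = \omega_{k'}/\omega_k$ under a boost.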





Scattering amplitude 

Now that we have set up the canonical formalism it is instructive to see how the invariant
amplitude $\mathcal{M}$ defined in the preceding chapter arises using this alternative formalism. Let
us calculate the amplitude $\langle k_3 k_4|e^{-iHT}|k_1 k_2\rangle = \langle k_3 k_4|e^{i\int d^4x\, \mathcal{L}(x)}|k_1 k_2\rangle$ for meson scattering
$k_1 + k_2 \to k_3 + k_4$ in order $\lambda$ with $u(\varphi) = \frac{\lambda}{4!}\varphi^4$. (We have, somewhat sloppily, turned
the large transition time $T$ into $\int dx^0$ when going over to the Lagrangian.) Expanding in
$\lambda$, we obtain $(-i\frac{\lambda}{4!}) \int d^4x\, \langle k_3 k_4|\varphi^4(x)|k_1 k_2\rangle$.

The calculation of the matrix element is not dissimilar to the one we just did for the
propagator. There we have the product of two field operators between the vacuum state.
Here we have the product of four field operators, all evaluated at the same spacetime point
$x$, sandwiched between two-particle states. There we look for a term of the form $a(\vec k)a^\dagger(\vec k)$.
Here, plugging the expansion (11) of the field into the product $\varphi^4(x)$, we look for terms of
the form $a^\dagger(\vec k_4)a^\dagger(\vec k_3)a(\vec k_2)a(\vec k_1)$, so that we could remove the two incoming particles and
produce the two outgoing particles. (To avoid unnecessary complications we assume that
all four momenta are different.) We now annihilate and create. The annihilation operator
$a(\vec k_1)$ could have come from any one of the four $\varphi$ fields in $\varphi^4$, giving a factor of 4, $a(\vec k_2)$
could have come from any one of the three remaining $\varphi$ fields, giving a factor of 3, $a^\dagger(\vec k_3)$
could have come from either of the two remaining $\varphi$ fields, giving a factor of 2, so that we
end up with a factor of 4!, which cancels the factor of $\frac{1}{4!}$ included in the definition of $\lambda$.
(This is of course why, for the sake of convenience, $\lambda$ is defined the way it is. Recall from
the preceding chapter an analogous step.)
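
The combinatorial factor can be confirmed by brute force. The sketch below simply enumerates every way of drawing the four required operators from the four $\varphi$ fields, one operator per field; the string labels are illustrative placeholders and carry no operator algebra.

```python
from itertools import permutations
from math import factorial

# The four operators needed to remove k1, k2 and produce k3, k4.
operators = ("a(k1)", "a(k2)", "adag(k3)", "adag(k4)")

# Each tuple lists which operator is taken from field 1, 2, 3 and 4 of phi^4.
assignments = list(permutations(operators))
n_ways = len(assignments)

# The count cancels the 1/4! in u(phi) = (lambda/4!) phi^4.
print(n_ways, n_ways / factorial(4))  # 24 1.0
```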

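As you just learned and as you can see from (11), we obtain a factor of $(1/\rho(k))e^{-ik\cdot x}$ for
each incoming particle and of $(1/\rho(k))e^{+ik\cdot x}$ for each outgoing particle, giving all together

$$\left( \prod_{\alpha=1}^{4} \frac{1}{\rho(k_\alpha)} \right) \int d^4x\, e^{i(k_3 + k_4 - k_1 - k_2)\cdot x} = \left( \prod_{\alpha=1}^{4} \frac{1}{\rho(k_\alpha)} \right) (2\pi)^4 \delta^{(4)}(k_3 + k_4 - k_1 - k_2)$$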

It is conventional to refer to $S_{fi} = \langle f|e^{-iHT}|i\rangle$, with some initial and final state, as
elements of an “S-matrix” and to define the “transition matrix” or T-matrix by $S = I + iT$,
that is,

$$S_{fi} = \delta_{fi} + iT_{fi} \tag{15}$$

In general, for initial and final states consisting of scalar particles characterized only by
their momenta, we write (using an obvious notation, for example $\sum_i k$ is the sum of the
particle momenta in the initial state), invoking momentum conservation:

$$iT_{fi} = (2\pi)^4 \delta^{(4)}\left( \sum_f k - \sum_i k \right) \left( \prod_\alpha \frac{1}{\rho(k_\alpha)} \right) \mathcal{M}(f \leftarrow i) \tag{16}$$

In our simple example, $iT(k_3 k_4, k_1 k_2) = (-i\frac{\lambda}{4!}) \int d^4x\, \langle k_3 k_4|\varphi^4(x)|k_1 k_2\rangle$, and our little
calculation showed that $\mathcal{M} = -i\lambda$, precisely as given in the preceding chapter. But this
connection between $T$ and $\mathcal{M}$, being “merely” kinematical, should hold in general, with the
invariant amplitude $\mathcal{M}$ determined by the Feynman rules. I will not give a long boring
formal proof, but you should check this assertion by working out some more involved cases,
such as the scattering amplitude to order $\lambda^2$, recovering (I.7.23), for example.

Thus, quite pleasingly, we see that the invariant amplitude $\mathcal{M}$ determined by the
Feynman rules represents the “heart of the matter” with the momentum conservation
delta function and normalization factors stripped away.

My pedagogical aim here is merely to make one more contact (we will come across
more in later chapters) between the canonical and path integral formalisms, giving
the simplest possible example avoiding all subtleties and complications. Those readers
into rigor are invited to replace the plane wave states $|k_1 k_2\rangle$ with wave packet states
$\int d^3k_1 \int d^3k_2\, f_1(k_1) f_2(k_2)|k_1 k_2\rangle$ for some appropriate functions $f_1$ and $f_2$, starting in the
far past when the wave packets were far apart, evolving into the far future, so on and so
forth, all the while smiling with self-satisfaction. The entire procedure is after all no different
from the treatment of scattering$^1$ in elementary nonrelativistic quantum mechanics.


Complex scalar field 

Thus far, we have discussed a hermitean (often called real in a minor abuse of terminology)
scalar field. Consider instead (as we will in chapter I.10) a nonhermitean (usually called
complex in another minor abuse) scalar field governed by $\mathcal{L} = \partial\varphi^\dagger \partial\varphi - m^2\varphi^\dagger\varphi$.

Again, following Heisenberg, we find the canonical momentum density conjugate to
the field $\varphi(\vec x, t)$, namely $\pi(\vec x, t) = \delta L/\delta\dot\varphi(\vec x, t) = \partial_0\varphi^\dagger(\vec x, t)$, so that $[\pi(\vec x, t), \varphi(\vec x', t)] =
[\partial_0\varphi^\dagger(\vec x, t), \varphi(\vec x', t)] = -i\delta^{(D)}(\vec x - \vec x')$. Similarly, the canonical momentum density conjugate
to the field $\varphi^\dagger(\vec x, t)$ is $\partial_0\varphi(\vec x, t)$.

Varying $\varphi^\dagger$ we obtain the Euler-Lagrange equation of motion $(\partial^2 + m^2)\varphi = 0$. [Similarly,
varying $\varphi$ we obtain $(\partial^2 + m^2)\varphi^\dagger = 0$.] Once again, we could Fourier expand, but now the
nonhermiticity of $\varphi$ means that we have to replace (11) by

$$\varphi(\vec x, t) = \int \frac{d^Dk}{\sqrt{(2\pi)^D 2\omega_k}} \left[ a(\vec k)e^{-i(\omega_k t - \vec k\cdot\vec x)} + b^\dagger(\vec k)e^{i(\omega_k t - \vec k\cdot\vec x)} \right] \tag{17}$$

In (11) hermiticity fixed the second term in the square bracket in terms of the first. Here
in contrast, we are forced to introduce two independent sets of creation and annihilation
operators $(a, a^\dagger)$ and $(b, b^\dagger)$. You should verify that the canonical commutation relations
imply that these indeed behave like creation and annihilation operators.

Consider the current

$$J_\mu = i(\varphi^\dagger \partial_\mu \varphi - \partial_\mu \varphi^\dagger \varphi) \tag{18}$$

Using the equations of motion you should check that $\partial_\mu J^\mu = i(\varphi^\dagger \partial^2 \varphi - \partial^2 \varphi^\dagger \varphi) = 0$. (This
follows immediately from the fact that the equation of motion for $\varphi^\dagger$ is the hermitean
conjugate of the equation of motion for $\varphi$.) The current is conserved and the corresponding
time-independent charge is given by (verify this!)

$$Q = \int d^Dx\, J_0(x) = \int d^Dk\, [a^\dagger(\vec k)a(\vec k) - b^\dagger(\vec k)b(\vec k)].$$

Thus the particle created by $a^\dagger$ (call it the “particle”) and the particle created by $b^\dagger$ (call it
the “antiparticle”) carry opposite charges. Explicitly, using the commutation relation we
have $Qa^\dagger|0\rangle = +a^\dagger|0\rangle$ and $Qb^\dagger|0\rangle = -b^\dagger|0\rangle$.

$^1$ For example, M. L. Goldberger and K. M. Watson, Collision Theory.

We conclude that $\varphi^\dagger$ creates a particle and annihilates an antiparticle, that is, it produces
one unit of charge. The field $\varphi$ does the opposite. You should understand this point
thoroughly, as we will need it when we come to the Dirac field for the electron and positron.


The energy of the vacuum 


As an instructive exercise let us calculate in the free scalar field theory the expectation
value $\langle 0|H|0\rangle = \int d^Dx\, \frac{1}{2}\langle 0|(\pi^2 + (\vec\nabla\varphi)^2 + m^2\varphi^2)|0\rangle$, which we may loosely refer to as the
“energy of the vacuum.” It is merely a matter of putting together (7), (11), and (12). Let us
focus on the third term in $\langle 0|H|0\rangle$, which in fact we already computed, since

$$\langle 0|\varphi(\vec x, t)\varphi(\vec x, t)|0\rangle = \langle 0|\varphi(\vec 0, 0)\varphi(\vec 0, 0)|0\rangle = \lim_{\vec x, t \to \vec 0, 0} \langle 0|\varphi(\vec x, t)\varphi(\vec 0, 0)|0\rangle = \lim_{\vec x, t \to \vec 0, 0} \int \frac{d^Dk}{(2\pi)^D 2\omega_k} e^{-i(\omega_k t - \vec k\cdot\vec x)} = \int \frac{d^Dk}{(2\pi)^D 2\omega_k}$$

The first equality follows from translation invariance, which also implies that the factor
$\int d^Dx$ in $\langle 0|H|0\rangle$ can be immediately replaced by $V$, the volume of space. The calculation
of the other two terms proceeds in much the same way: for example, the two factors of $\vec\nabla$
in $(\vec\nabla\varphi)^2$ just bring down a factor of $\vec k^2$. Thus

$$\langle 0|H|0\rangle = V \int \frac{d^Dk}{(2\pi)^D 2\omega_k}\, \frac{1}{2}(\omega_k^2 + \vec k^2 + m^2) = V \int \frac{d^Dk}{(2\pi)^D}\, \frac{1}{2}\hbar\omega_k \tag{19}$$

upon restoring $\hbar$.

You should find this result at once gratifying and alarming, gratifying because we recognize
it as the zero point energy of the harmonic oscillator integrated over all momentum
modes and over all space, and alarming because the integral over $\vec k$ clearly diverges. But we
should not be alarmed: the energy of any physical configuration, for example the mass of a
particle, is to be measured relative to the “energy of the vacuum.” We ask for the difference
in the energy of the world with and without the particle. In other words, we could simply
define the correct Hamiltonian to be $H - \langle 0|H|0\rangle$. We will come back to some of these
issues in chapters II.5, III.1, and VIII.2.


Nobody is perfect 

In the canonical formalism, time is treated differently from space, and so one might worry
about the Lorentz invariance of the resulting field theory. In the standard treatment given
in many texts, we would go on from this point and use the Hamiltonian to generate the
dynamics, developing a perturbation theory in the interaction $u(\varphi)$. After a number of
formal steps, we would manage to derive the Feynman rules, which manifestly define a
Lorentz-invariant theory.

Historically, there was a time when people felt that quantum field theory should be
defined by its collection of Feynman rules, which gives us a concrete procedure to calculate
measurable quantities, such as scattering cross sections. An extreme view along this line
held that fields are merely mathematical crutches used to help us arrive at the Feynman
rules and should be thrown away at the end of the day.

This view became untenable starting in the 1970s, when it was realized that there 
is a lot more to quantum field theory than Feynman diagrams. Field theory contains 
nonperturbative effects invisible by definition to Feynman diagrams. Many of these effects, 
which we will get to in due time, are more easily seen using the path integral formalism. 

As I said, the canonical and the path integral formalisms often appear to be complementary,
and I will refrain from entering into a discussion about which formalism is superior.
In this book, I adopt a pragmatic attitude and use whatever formalism happens to be easier
for the problem at hand.

Let me mention, however, some particularly troublesome features in each of the two
formalisms. In the canonical formalism fields are quantum operators containing an infinite
number of degrees of freedom, and sages once debated such delicate questions as
how products of fields are to be defined. On the other hand, in the path integral formalism,
plenty of sins can be swept under the rug known as the integration measure (see
chapter IV.7).


Appendix 1


It may seem a bit puzzling that in the canonical formalism the propagator has to be defined with time ordering, 
which we did not need in the path integral formalism. To resolve this apparent puzzle, it suffices to look at 
quantum mechanics. 

Let $A[q(t_1)]$ be a function of $q$, evaluated at time $t_1$. What does the path integral $\int Dq(t)\, A[q(t_1)]\, e^{i\int_0^T dt\, L(\dot q, q)}$
represent in the operator language? Well, working backward to (I.2.4) we see that we would slip $A[q(t_1)]$ into the
factor $\langle q_{j+1}|e^{-iH\delta t}|q_j\rangle$, where the integer $j$ is determined by the condition that the time $t_1$ occurs between the
times $j\delta t$ and $(j+1)\delta t$. In the resulting factor $\langle q_{j+1}|e^{-iH\delta t} A[q(t_1)]|q_j\rangle$, we could replace the c-number $A[q(t_1)]$
by the operator $A[\hat q]$, since $A[\hat q]|q_j\rangle = A[q_j]|q_j\rangle \simeq A[q(t_1)]|q_j\rangle$ to the accuracy we are working with. Note that $\hat q$ is
evidently a Schrödinger operator. Thus, putting in this factor $\langle q_{j+1}|e^{-iH\delta t} A[\hat q]|q_j\rangle$ together with all the factors of
$\langle q_{i+1}|e^{-iH\delta t}|q_i\rangle$, we find that the integral in question, namely $\int Dq(t)\, A[q(t_1)]\, e^{i\int_0^T dt\, L(\dot q, q)}$, actually represents

$$\langle q_F|e^{-iH(T - t_1)} A[\hat q] e^{-iHt_1}|q_I\rangle = \langle q_F|e^{-iHT} A[\hat q(t_1)]|q_I\rangle$$

where $\hat q(t_1)$ is now evidently a Heisenberg operator. [We have used the standard relation between Heisenberg
and Schrödinger operators, namely, $O_H(t) = e^{iHt} O_S e^{-iHt}$.] I find this passage back and forth between the Dirac,
Schrödinger, and Heisenberg pictures quite instructive, perhaps even amusing.

We are now prepared to ask the more complicated question: what does the path integral
$\int Dq(t)\, A[q(t_1)]B[q(t_2)]\, e^{i\int_0^T dt\, L(\dot q, q)}$ represent in the operator language? Here $B[q(t_2)]$ is some other function of
$q$ evaluated at time $t_2$. So we also slip $B[q(t_2)]$ into the appropriate factor in (I.2.4) and replace $B[q(t_2)]$ by $B[\hat q]$.
But now we see that we have to keep track of whether $t_1$ or $t_2$ is the earlier of the two times. If $t_2$ is earlier, the
operator $A[\hat q]$ would appear to the left of the operator $B[\hat q]$, and if $t_1$ is earlier, to the right. Explicitly, if $t_2$ is earlier
than $t_1$, we would end up with the sequence

$$e^{-iH(T - t_1)} A[\hat q] e^{-iH(t_1 - t_2)} B[\hat q] e^{-iHt_2} = e^{-iHT} A[\hat q(t_1)] B[\hat q(t_2)] \tag{20}$$

upon passing from the Schrödinger to the Heisenberg picture, just as in the simpler situation above. Thus we
define the time-ordered product

$$T[A[\hat q(t_1)]B[\hat q(t_2)]] \equiv \theta(t_1 - t_2) A[\hat q(t_1)]B[\hat q(t_2)] + \theta(t_2 - t_1) B[\hat q(t_2)]A[\hat q(t_1)] \tag{21}$$

We just learned that

$$\langle q_F|e^{-iHT}\, T[A[\hat q(t_1)]B[\hat q(t_2)]]|q_I\rangle = \int Dq(t)\, A[q(t_1)]B[q(t_2)]\, e^{i\int_0^T dt\, L(\dot q, q)} \tag{22}$$

The concept of time ordering does not appear on the right-hand side, but is essential on the left-hand side.

Generalizing the discussion here, we see that the Green’s functions $G^{(n)}(x_1, x_2, \cdots, x_n)$ introduced in the
preceding chapter [see (I.7.13-15)] are given in the canonical formalism by the vacuum expectation value of a
time-ordered product of field operators $\langle 0|T\{\varphi(x_1)\varphi(x_2) \cdots \varphi(x_n)\}|0\rangle$. That (13) gives the propagator is a special case
of this relationship.

We could also consider $\langle 0|T\{O_1(x_1)O_2(x_2) \cdots O_n(x_n)\}|0\rangle$, the vacuum expectation value of a time-ordered
product of various operators $O_i(x)$ [the current $J_\mu(x)$, for example] made out of the quantum field. Such objects
will appear in later chapters [for example, (VII.3.7)].
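
The bookkeeping behind a time-ordered product of several operators can be sketched in a few lines. This is only an illustration of the ordering rule for bosonic operators, assuming all times are distinct; the function name `time_ordered` and the (label, time) representation are hypothetical and carry no operator content.

```python
def time_ordered(factors):
    """Return operator labels ordered with later times to the left.

    `factors` is a list of (label, time) pairs given in any order.
    Only the ordering rule is modeled: the labels stand in for operators.
    """
    return [label for label, _ in sorted(factors, key=lambda f: f[1], reverse=True)]

# A at t1 = 1.0 and B at t2 = 2.0: t1 is earlier, so A ends up to the right of B.
print(time_ordered([("A", 1.0), ("B", 2.0)]))  # ['B', 'A']
```

The same rule applied to $n$ labels gives the operator ordering inside $\langle 0|T\{\cdots\}|0\rangle$.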


Appendix 2: Field redefinition 


This is perhaps a good place to reveal to the innocent reader that there does not exist an international commission
in Brussels mandating what field one is required to use. If we use $\varphi$, some other guy is perfectly entitled to use
$\eta$, assuming that the two fields are related by some invertible function with $\eta = f(\varphi)$. (To be specific, it is often
helpful to think of $\eta = \varphi + \alpha\varphi^3$ with some parameter $\alpha$.) This is known as a field redefinition, an often useful
thing to do, as we will see repeatedly.

The S-matrix amplitudes that experimentalists measure are invariant under field redefinition. But this is
tautological trivia: the scattering amplitude $\langle k_3 k_4|e^{-iHT}|k_1 k_2\rangle$, for example, does not even know about $\varphi$ and $\eta$.
The issue is with the formalism we use to calculate the S-matrix.

In the path integral formalism, it is also trivial that we could write $Z(J) = \int D\eta\, e^{i[S(\eta) + \int d^4x\, J\eta]}$ just as well
as $Z(J) = \int D\varphi\, e^{i[S(\varphi) + \int d^4x\, J\varphi]}$. This result, a mere change of integration variable, was known to Newton and
Leibniz. But suppose we write $\tilde Z(J) = \int D\varphi\, e^{i[S(\varphi) + \int d^4x\, J\eta]}$. Now of course any dolt could see that $\tilde Z(J) \neq Z(J)$,
and a fortiori, the Green’s functions (I.7.14,15) obtained by differentiating $\tilde Z(J)$ and $Z(J)$ are not equal.

The nontrivial physical statement is that the S-matrix amplitudes obtained from $Z(J)$ and $\tilde Z(J)$ are in fact
the same. This better be the case, since we are claiming that the path integral formalism provides a way to actual
physics. To see how this apparent “miracle” (Green’s functions completely different, S-matrix amplitudes the
same) occurs, let us think physically. We set up our sources to produce or remove one single field disturbance,
as indicated in figure I.4.1. Our friend, who uses $\tilde Z(J)$, in contrast, set up his sources to produce or remove
$\eta = \varphi + \alpha\varphi^3$ (we specialize for pedagogical clarity), so that once in a while (with a probability determined by $\alpha$)
he is producing three field disturbances instead of one, as shown in figure I.8.1. As a result, while he thinks
that he is scattering four mesons, occasionally he is actually scattering six mesons. (Perhaps he should give his
accelerator a tune up.)

But to obtain S-matrix amplitudes we are told to multiply the Green’s functions by $(k^2 - m^2)$ for each external
leg carrying momentum $k$, and then set $k^2$ to $m^2$. When we do this, the diagram in figure I.8.1a survives, since
it has a pole that goes like $1/(k^2 - m^2)$ but the extraneous diagram in figure I.8.1b is eliminated. Very simple.
One point worth emphasizing is that $m$ here is the actual physical mass of the particle. Let’s be precise when we
should. Take the single particle state $|\vec k\rangle$. Act on it with the Hamiltonian. Then $H|\vec k\rangle = \sqrt{\vec k^2 + m^2}\, |\vec k\rangle$. The $m$ that
appears in the eigenvalue of the Hamiltonian is the actual physical mass. We will come back to the issue of the
physical mass in chapter III.3.

In the canonical formalism, the field is an operator, and as we saw just now, the calculation of S-matrix
amplitudes involves evaluating products of field operators between physical states. In particular, the matrix
elements $\langle \vec k|\varphi|0\rangle$ and $\langle 0|\varphi|\vec k\rangle$ (related by hermitean conjugation) come in crucially. If we use some other field $\eta$,
what matters is merely that $\langle \vec k|\eta|0\rangle$ is not zero, in which case we could always write $\langle \vec k|\eta|0\rangle = Z^{1/2}\langle \vec k|\varphi|0\rangle$ with
$Z$ some c-number. We simply divide the scattering amplitude by the appropriate powers of $Z^{1/2}$.

Figure I.8.1 (panels (a) and (b))


Exercises 

I.8.1 Derive (14). Then verify explicitly that $d^Dk/(2\omega_k)$ is indeed Lorentz invariant. Some authors prefer to
replace $\sqrt{2\omega_k}$ in (11) by $2\omega_k$ when relating the scalar field to the creation and annihilation operators.
Show that the operators defined by these authors are Lorentz covariant. Work out their commutation
relation.

I.8.2 Calculate $\langle \vec k'|H|\vec k\rangle$, where $|\vec k\rangle = a^\dagger(\vec k)|0\rangle$.

I.8.3 For the complex scalar field discussed in the text calculate $\langle 0|T[\varphi(x)\varphi^\dagger(0)]|0\rangle$.

I.8.4 Show that $[Q, \varphi(x)] = -\varphi(x)$.



Disturbing the Vacuum 


Casimir effect 

In the preceding chapter, we computed the energy of the vacuum $\langle 0|H|0\rangle$ and obtained
the gratifying result that it is given by the zero point energy of the harmonic oscillator
integrated over all momentum modes and over space. I explained that the energy of any
physical configuration, for example, the mass of a particle, is to be measured relative to
this vacuum energy. In effect, we simply subtract off this vacuum energy and define the
correct Hamiltonian to be $H - \langle 0|H|0\rangle$.

But what if we disturb the vacuum? 

Physically, we could compare the energy of the vacuum before and after we introduce
the disturbance, by varying the disturbance for example. Of course, it is not just our
textbook scalar field that contributes to the energy of the vacuum. The electromagnetic
field, for instance, also undergoes quantum fluctuation and would contribute, with its
two polarization degrees of freedom, to the energy density $\varepsilon$ of the vacuum the amount
$2 \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{2}\hbar\omega_k$. In 1948 Casimir had the brilliant insight that we could disturb the
vacuum and produce a shift $\Delta\varepsilon$. While $\varepsilon$ is not observable, $\Delta\varepsilon$ should be observable since
we can control how we disturb the vacuum. In particular, Casimir considered introducing
two parallel “perfectly” conducting plates (formally of zero thickness and infinite extent)
into the vacuum. The variation of $\Delta\varepsilon$ with the distance $d$ between the plates would lead to
a force between the plates, known as the Casimir force. In reality, it is the electromagnetic
field that is responsible, not our silly scalar field.

Call the direction perpendicular to the plates the $x$ axis. Because of the boundary
conditions the electromagnetic field must satisfy on the conducting plates, the wave vector
$\vec k$ can only take on the values $(\pi n/d, k_y, k_z)$, with $n$ an integer. Thus the energy per unit
area between the plates is changed to $\sum_n \int \frac{dk_y\, dk_z}{(2\pi)^2} \sqrt{(\pi n/d)^2 + k_y^2 + k_z^2}$.

To calculate the force, we vary $d$, but then we would have to worry about how the energy
density outside the two plates varies. A clever trick exists for avoiding this worry: we
introduce three plates! See figure I.9.1. We hold the two outer plates fixed and move
only the inner plate. Now we don’t have to worry about the world outside the plates. The
separation $L$ between the two outer plates can be taken to be as large as we like.

Figure I.9.1

In the spirit of this book (and my philosophy) of avoiding computational tedium as
much as possible, I propose two simplifications: (I) do the calculation for a massless scalar
field instead of the electromagnetic field so we won’t have to worry about polarization
and stuff like that, and (II) retreat to a (1 + 1)-dimensional spacetime so we won’t have
to integrate over $k_y$ and $k_z$. Readers of quantum field theory texts do not need to watch
their authors show off their prowess in doing integrals. As you will see, the calculation is
exceedingly instructive and gives us an early taste of the art of extracting finite physical
results from seemingly infinite expressions, known as regularization, that we will study
in chapters III.1-3.

With this set-up, the energy $E = f(d) + f(L - d)$ with

$$f(d) = \frac{1}{2}\sum_{n=1}^{\infty} \omega_n = \frac{\pi}{2d}\sum_{n=1}^{\infty} n \tag{1}$$

since the modes are given by $\sin(n\pi x/d)$ $(n = 1, \cdots, \infty)$ with the corresponding energy
$\omega_n = n\pi/d$.

Aagh! What do we do with $\sum_{n=1}^{\infty} n$? None of the ancient Greeks from Zeno on could tell us.
What they should tell us is that we are doing physics, not mathematics! Physical plates
cannot keep arbitrarily high frequency waves from leaking out.$^1$

To incorporate this piece of all-important physics, we should introduce a factor $e^{-a\omega_n/\pi}$
with a parameter $a$ having the dimension of time (or in our natural units, length) so
that modes with $\omega_n \gg \pi/a$ do not contribute: they don’t see the plates! The characteristic
frequency $\pi/a$ parametrizes the high frequency response of the conducting plates. Thus
we have

$$f(d) = \frac{\pi}{2d}\sum_{n=1}^{\infty} n e^{-an/d} = -\frac{\pi}{2}\frac{\partial}{\partial a}\sum_{n=1}^{\infty} e^{-an/d} = -\frac{\pi}{2}\frac{\partial}{\partial a}\frac{1}{1 - e^{-a/d}} = \frac{\pi}{2d}\frac{e^{a/d}}{(e^{a/d} - 1)^2}$$

$^1$ See footnote 1 in chapter III.1.


Since we want $a^{-1}$ to be large, we take the limit $a$ small so that

$$f(d) = \frac{\pi d}{2a^2} - \frac{\pi}{24d} + \frac{\pi a^2}{480 d^3} + O(a^4/d^5). \tag{2}$$
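
The expansion (2) is easy to check numerically. The sketch below, with illustrative function names and sample values $d = 1$, $a = 0.05$ chosen here rather than in the text, compares the closed form of the regulated sum, a direct (truncated) summation, and the first three terms of the small-$a$ series.

```python
import math

def f_closed(d, a):
    """Closed form (pi/2d) e^{a/d} / (e^{a/d} - 1)^2 of the regulated sum."""
    x = a / d
    return (math.pi / (2.0 * d)) * math.exp(x) / math.expm1(x) ** 2

def f_sum(d, a, n_max=20000):
    """Direct evaluation of (pi/2d) * sum_n n exp(-a n / d), truncated at n_max."""
    x = a / d
    total = math.fsum(n * math.exp(-x * n) for n in range(1, n_max + 1))
    return (math.pi / (2.0 * d)) * total

def f_series(d, a):
    """First three terms of the small-a expansion (2)."""
    return (math.pi * d / (2.0 * a ** 2)
            - math.pi / (24.0 * d)
            + math.pi * a ** 2 / (480.0 * d ** 3))

d, a = 1.0, 0.05
print(f_closed(d, a), f_sum(d, a), f_series(d, a))
```

The three numbers should agree closely, with the residual difference between the closed form and the series of the order of the neglected $O(a^4/d^5)$ term.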


Note that f(d) blows up as a -> 0, as it should, since we are then back to (1). But the force 
between two conducting plates shouldn’t blow up. Experimentalists might have noticed it 
by now! 

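Well, the force is given by

$$F = -\frac{\partial E}{\partial d} = -\{f'(d) - f'(L - d)\} = -\left\{ \left( \frac{\pi}{2a^2} + \frac{\pi}{24d^2} + \cdots \right) - (d \to L - d) \right\} \underset{a \to 0}{\longrightarrow} -\frac{\pi}{24}\left( \frac{1}{d^2} - \frac{1}{(L - d)^2} \right) \underset{L \gg d}{\longrightarrow} -\frac{\pi\hbar c}{24d^2} \tag{3}$$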


Behold, the parameter $a$ we have introduced to make sense of the divergent sum in (1) has
disappeared in our final result for the physically measurable force. In the last step, using
dimensional analysis we restored $\hbar$ to underline the quantum nature of the force.
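
The disappearance of $a$ can be watched numerically. The following is a rough sketch in natural units, assuming the closed form of the regulated $f(d)$ given above; the helper names (`f_reg`, `force`, `force_limit`), the finite-difference step, and the sample values $d = 1$, $L = 10$ are illustrative choices. It differentiates $E = f(d) + f(L - d)$ numerically for several values of $a$ and compares the result with the $a \to 0$ limit in (3).

```python
import math

def f_reg(s, a):
    """Regulated zero-point energy (pi/2s) e^{a/s} / (e^{a/s} - 1)^2 for plate separation s."""
    x = a / s
    return (math.pi / (2.0 * s)) * math.exp(x) / math.expm1(x) ** 2

def force(d, L, a, h=1e-4):
    """F = -dE/dd with E = f(d) + f(L - d), estimated by a central difference."""
    def energy(s):
        return f_reg(s, a) + f_reg(L - s, a)
    return -(energy(d + h) - energy(d - h)) / (2.0 * h)

def force_limit(d, L):
    """The a -> 0 limit -(pi/24) (1/d^2 - 1/(L-d)^2) from (3)."""
    return -(math.pi / 24.0) * (1.0 / d ** 2 - 1.0 / (L - d) ** 2)

d, L = 1.0, 10.0
for a in (0.1, 0.05, 0.02):
    print(a, force(d, L, a), force_limit(d, L))
```

Although each of $f(d)$ and $f(L - d)$ grows like $1/a^2$, the computed force should settle toward the $a$-independent limit as $a$ decreases.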

The Casimir force between two plates is attractive. Notice that the $1/d^2$ of the force
simply follows from dimensional analysis since in natural units force has dimension of
an inverse length squared. In a tour de force, experimentalists have measured this tiny
force. The fluctuating quantum field is quite real!

To obtain a sensible result we need to regularize in the ultraviolet (namely the high 
frequency or short time behavior parametrized by a) and in the infrared (namely the long 
distance cutoff represented by L). Notice how a and L “work” together in (3). 

This calculation foreshadows the renormalization of quantum field theories, a topic
made mysterious and almost incomprehensible in many older texts. In fact, it is perfectly
sensible. We will discuss renormalization in detail in chapters III.1 and III.2, but for now
let us review what we just did.

Instead of panicking when faced with the divergent sum in (1), we remind ourselves
that we are proud physicists and that physics tells us that the sum should not go all the
way to infinity. In a conducting plate, electrons rush about to counteract any applied tangential
electric field. But when the incident wave oscillates at sufficiently high frequency,
the electrons can’t keep up. Thus the idealization of a perfectly conducting plate fails. We
regularize (such an ugly term but that’s what field theorists use!) the sum in a mathematically
convenient way by introducing a damping factor. The single parameter $a$ is supposed
to summarize the unknown high frequency physics that causes the electron to fail to keep
up. In reality, $a^{-1}$ is related to the plasma frequency of the metal making up the plate.
A priori, the Casimir force between the two plates could end up depending on the
parameter $a$. In that case, the Casimir force would tell us something about the response of a
conducting plate to high frequency electric fields, and it would have made for an interesting
chapter in a text on solid state physics. Since this is in fact a quantum field theory text, you
might have suspected that the Casimir force represents some fundamental physics about
fluctuating quantum fields and that $a$ would drop out. That the Casimir force is “universal”
makes it unusually interesting. Notice however, as is physically sensible, that the $O(1/d^4)$
correction to the Casimir force does depend on whether the experimentalist used copper
or aluminum plates.

We might then wonder whether the leading term $F = -\pi/(24d^2)$ depends on the
particular regularization we used. What if we suppress the higher terms in the divergent
sum with some other function? We will address this question in the appendix to this
chapter.

Amusingly, the 24 in (3) is the same 24 that appears in string theory! (The dimension
of spacetime the quantum bosonic string must live in is determined to be 24 + 2 = 26.)
The reader who knows string theory would know what these two cryptic statements are
about (summing up the zero modes of the string). Appallingly, in an apparent attempt to
make the subject appear even more mysterious than it is, some treatments simply assert
that the sum $\sum_{n=1}^{\infty} n$ is by some mathematical sleight-of-hand equal to $-1/12$. Even though
it would have allowed us to wormhole from (1) to (3) instantly, this assertion is manifestly
absurd. What we did here, however, makes physical sense.


Appendix 


Here we address the fundamental issue of whether a physical quantity we extract by cutting off high frequency
contributions could depend on how we cut. Let me mention that in recent years the study of Casimir force for
actual physical situations has grown into an active area of research, but clearly my aim here is not to give a realistic
account of this field, but to study in an easily understood context an issue (as you will see) central to quantum
field theory. My hope is that by the time you get to actually regularize a field theory in (3 + 1)-dimensional
spacetime, you would have amply mastered the essential physics and not have to struggle with the mechanics of
regularization.

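Let us first generalize a bit the regularization scheme we used and write

$$f(d) = \frac{\pi}{2d}\sum_{n=1}^{\infty} n\, g\!\left(\frac{na}{d}\right) = \frac{\pi}{2}\frac{\partial}{\partial a}\sum_{n=1}^{\infty} h\!\left(\frac{na}{d}\right) \equiv \frac{\pi}{2}\frac{\partial}{\partial a} H\!\left(\frac{a}{d}\right) \tag{4}$$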


Here $g(v) = h'(v)$ is a rapidly decreasing function so that the sums make sense, chosen judiciously to allow
ready evaluation of $H(a/d) \equiv \sum_{n=1}^{\infty} h(na/d)$. [In (2), we chose $g(v) = e^{-v}$ and hence $h(v) = -e^{-v}$.] We would like
to know how the Casimir force,

$$-F = \frac{\partial f(d)}{\partial d} - (d \to L - d) = \frac{\pi}{2}\frac{\partial^2}{\partial d\, \partial a} H\!\left(\frac{a}{d}\right) - (d \to L - d) \tag{5}$$

depends on $g(v)$.

Let us try to get as far as we can using physical arguments and dimensional analysis. Expand $H$ as follows:
$\pi H(a/d) = \cdots + \gamma_{-2} d^2/a^2 + \gamma_{-1} d/a + \gamma_0 + \gamma_1 a/d + \gamma_2 a^2/d^2 + \cdots$. We might be tempted to just write a Taylor
series in $a/d$, but nothing tells us that $H(a/d)$ might not blow up as $a \to 0$. Indeed, the example in (3) contains
a term like $d/a$, and so we better be cautious.
We will presently argue physically that the series in fact terminate in one direction. The force is given by

$$F = \left( \cdots + \gamma_{-2}\frac{2d}{a^3} + \gamma_{-1}\frac{1}{2a^2} + \gamma_1\frac{1}{2d^2} + \gamma_2\frac{2a}{d^3} + \cdots \right) - (d \to L - d) \tag{6}$$


Look at the $\gamma_{-2}$ term: it contributes to the force a term like $(d - (L - d))/a^3$. But as remarked earlier, the two
outer plates could be taken as far apart as we like. The force could not depend on $L$, and thus on physical grounds
$\gamma_{-2}$ must vanish. Similarly, all $\gamma_{-k}$ for $k > 2$ must vanish.

Next, we note that the $\gamma_{-1}$ term, although definitely not zero, gets subtracted away since it does not depend on $d$.
(The $\gamma_0$ term has already gone away.) At this point, notice that, furthermore, the $\gamma_k$ terms with $k \geq 2$ all vanish
as $a \to 0$. You could check that all these assertions hold for the $g(v)$ used in the text.

Remarkably, the Casimir force is determined by $\gamma_1$ alone: $F = \gamma_1/(2d^2)$. As noted earlier, the force has to be
proportional to $1/d^2$. This fact alone shows us that in (6) we only need to keep the $\gamma_1$ term. In the text, we found
$\gamma_1 = -\pi/12$. In exercise I.9.1 I invite you to go through an amusing calculation obtaining the same value for $\gamma_1$
with an entirely different choice of $g(v)$.
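
As a numerical illustration of this regulator independence, the sketch below repeats the force calculation with a Gaussian damping function $g(v) = e^{-v^2}$. This particular $g$, the function names, the truncation rule for the sum, and the sample values are assumptions made here for illustration; they are not necessarily the choice intended in exercise I.9.1.

```python
import math

def f_gauss(s, a):
    """(pi/2s) * sum_n n g(n a / s) with the assumed regulator g(v) = exp(-v^2)."""
    # Terms with (n a / s)^2 beyond about 144 are far below double precision.
    n_max = int(12.0 * s / a) + 1
    total = math.fsum(n * math.exp(-(n * a / s) ** 2) for n in range(1, n_max + 1))
    return (math.pi / (2.0 * s)) * total

def force_gauss(d, L, a, h=1e-3):
    """F = -dE/dd with E = f(d) + f(L - d), estimated by a central difference."""
    def energy(s):
        return f_gauss(s, a) + f_gauss(L - s, a)
    return -(energy(d + h) - energy(d - h)) / (2.0 * h)

def force_limit(d, L):
    """The regulator-independent result -(pi/24) (1/d^2 - 1/(L-d)^2)."""
    return -(math.pi / 24.0) * (1.0 / d ** 2 - 1.0 / (L - d) ** 2)

d, L = 1.0, 10.0
for a in (0.2, 0.1, 0.05):
    print(a, force_gauss(d, L, a), force_limit(d, L))
```

The divergent piece of $f$ now has a different coefficient than with exponential damping, yet the force should approach the same limit as $a$ decreases, consistent with $\gamma_1 = -\pi/12$.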

This already suggests that the Casimir force is regularization independent, that it tells us more about the
vacuum than about metallic conductivity, but still it is highly instructive to study an entire class of damping or
regularizing functions to watch how regularization independence emerges. Let us regularize the sum over zero
point energies to $f(d) = \frac{1}{2}\sum_{n=1}^{\infty} \omega_n K(\omega_n)$ with

$$K(\omega) = \sum_\alpha c_\alpha \frac{\Lambda_\alpha}{\omega + \Lambda_\alpha} \tag{7}$$


Here $c_\alpha$ is a bunch of real numbers and $\Lambda_\alpha$ (known as regulators or regulator frequencies) a bunch of high
frequencies subject to certain conditions but otherwise chosen entirely at our discretion. For the sum $\sum_{n=1}^{\infty} \omega_n K(\omega_n)$
to converge, we need $K(\omega_n)$ to vanish faster than $1/\omega^2$. In fact, for $\omega$ much larger than $\Lambda_\alpha$, $K(\omega) \to \frac{1}{\omega}\sum_\alpha c_\alpha\Lambda_\alpha -
\frac{1}{\omega^2}\sum_\alpha c_\alpha\Lambda_\alpha^2 + \cdots$. The requirement that the $1/\omega$ and $1/\omega^2$ terms vanish gives the conditions

$$\sum_\alpha c_\alpha \Lambda_\alpha = 0 \tag{8}$$

and

$$\sum_\alpha c_\alpha \Lambda_\alpha^2 = 0 \tag{9}$$

respectively.

Furthermore, low frequency physics is not to be modified, and so we want $K(\omega) \to 1$ for $\omega \ll \Lambda_a$, thus 
requiring 

$$\sum_a c_a = 1 \tag{10}$$

At this point, we do not even have to specify the set the index $a$ runs over beyond the fact that the three conditions 
(8), (9), and (10) require that $a$ must take on at least three values. Note also that some of the $c_a$’s must be negative. 
Incidentally, we could do with fewer regulators if we are willing to invoke some knowledge of metals, for instance, 
that $K(\omega) = K(-\omega)$, but that is not the issue here. 

We now show that the Casimir force between the two plates does not depend on the choices of $c_a$ and $\Lambda_a$. 
First, being physicists rather than mathematicians, we freely interchange the two sums in $f(d)$ and write 

$$f(d) = \frac{1}{2}\sum_a c_a\Lambda_a\sum_n\frac{\omega_n}{\omega_n + \Lambda_a} = -\frac{1}{2}\sum_a c_a\Lambda_a\sum_n\frac{\Lambda_a}{\omega_n + \Lambda_a} \tag{11}$$
where, without further ceremony, we have used condition (8). Next, keeping in mind that the sum 
$\sum_n \Lambda_a/(\omega_n + \Lambda_a)$ is to be put back into (11), we massage it (defining for convenience $b_a = \pi/(\Lambda_a d)$, so that $\omega_n/\Lambda_a = n b_a$) as follows: 

$$\sum_{n=1}^{\infty}\frac{1}{1 + bn} = \sum_{n=1}^{\infty}\int_0^{\infty}dt\, e^{-t(1+bn)} = \int_0^{\infty}dt\,\frac{e^{-t}}{e^{tb} - 1} = \int_0^{\infty}dt\, e^{-t}\left(\frac{1}{tb} - \frac{1}{2} + \frac{tb}{12} + O(b^3)\right) \tag{12}$$

(To avoid clutter we have temporarily suppressed the index $a$.) All these manipulations make perfect sense since 
the entire expression is to be inserted into the sum over $a$ in (11) after we restore the index $a$. It appears that the 
result would depend on $c_a$ and $\Lambda_a$. In fact, mentally restoring and inserting, we see that the $1/b$ term in (12) can 
be thrown away since 

$$\sum_a c_a\Lambda_a/b_a = \frac{d}{\pi}\sum_a c_a\Lambda_a^2 = 0 \tag{13}$$

[There is in fact a bit of an overkill here since this term corresponds to the $\gamma_{-1}$ term, which does not appear in 
the force anyway. Thus the condition (9) is, strictly speaking, not necessary. We are regularizing not merely the 
force, but $f(d)$ so that it defines a sensible function.] Similarly, the $b^0$ term in (12) can be thrown away thanks 
to (8). Thus, keeping only the $b$ term in (12), we obtain 


$$f(d) = -\frac{1}{24}\int_0^{\infty}dt\, e^{-t}\,t\sum_a c_a\Lambda_a b_a + O\!\left(\frac{1}{d^3}\right) = -\frac{\pi}{24d} + O\!\left(\frac{1}{d^3}\right) \tag{14}$$

Indeed, $f(d)$, and a fortiori the Casimir force, do not depend on the $c_a$’s and $\Lambda_a$’s. To the level of rigor 
entertained by physicists (but certainly not mathematicians), this amounts to a proof of regularization independence 
since with enough regulators we could approximate any (reasonable) function $K(\omega)$ that actually describes real 
conducting plates. Again, as is physically sensible, you could check that the $O(1/d^3)$ term in $f(d)$ does depend 
on the regularization scheme. 
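
As a rough numerical illustration of this regularization independence, the following Python sketch sums the regularized zero point energies directly. Everything specific in it is an assumption made for the example: three regulators at hypothetical frequencies, plate spacings 1, 2 and 3 in arbitrary units, and a mode sum cut off at one common frequency far above the regulators. Because a direct sum also produces pieces of $f(d)$ that are constant or linear in $d$ (which never reach the force), the sketch forms the combination $f(1) + f(3) - 2f(2)$, in which such pieces cancel, and compares it with what $-\pi/(24d)$ alone would give.

```python
import math

def regulator_weights(lams):
    """Solve conditions (8), (9), (10) for exactly three regulator frequencies.

    The closed form is Lagrange interpolation at zero: sum_a c_a p(lam_a) = p(0)
    for every polynomial p of degree <= 2.
    """
    weights = []
    for a, la in enumerate(lams):
        c = 1.0
        for b, lb in enumerate(lams):
            if b != a:
                c *= lb / (lb - la)
        weights.append(c)
    return weights

def K(omega, cs, lams):
    """Regularizing function of eq. (7)."""
    return sum(c * lam / (omega + lam) for c, lam in zip(cs, lams))

def f_truncated(d, cs, lams, modes_per_unit_d):
    """(1/2) sum_n omega_n K(omega_n) with omega_n = n pi / d.

    d is an integer spacing; the sum stops at the same frequency for every d.
    """
    n_max = modes_per_unit_d * d
    return 0.5 * math.fsum(
        (n * math.pi / d) * K(n * math.pi / d, cs, lams)
        for n in range(1, n_max + 1)
    )

def energy_difference(cs, lams, modes_per_unit_d=2000):
    """f(1) + f(3) - 2 f(2): terms constant or linear in d drop out."""
    return (f_truncated(1, cs, lams, modes_per_unit_d)
            + f_truncated(3, cs, lams, modes_per_unit_d)
            - 2.0 * f_truncated(2, cs, lams, modes_per_unit_d))

# Value predicted by f(d) = -pi/(24 d) alone.
expected = -math.pi / 24.0 * (1.0 / 1.0 + 1.0 / 3.0 - 2.0 / 2.0)

for lams in ([50.0, 100.0, 150.0], [50.0, 150.0, 250.0]):
    cs = regulator_weights(lams)
    print(lams, cs, energy_difference(cs, lams), expected)
```

Two quite different regulator sets should land close to the same number, with a small residual that shrinks as the regulator frequencies are raised, in line with the scheme-dependent $O(1/d^3)$ term.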

The reason that I did this calculation in detail is that we will encounter this class of regularization, known as 
Pauli-Villars, in chapter III.1 and especially in the calculation of the anomalous magnetic moment of the electron 
in chapter III.7, and it is instructive to see how regularization works in a more physical context before dealing 
with all the complications of relativistic field theory. 


Exercises 

I.9.1 Choose the damping function $g(v) = 1/(1 + v)^2$ instead of the one in the text. Show that this 
results in the same Casimir force. [Hint: To sum the resulting series, pass to an integral representation 
$H(\xi) = -\sum_{n=1}^{\infty} 1/(1 + n\xi) = -\sum_{n=1}^{\infty}\int_0^{\infty}dt\, e^{-(1+n\xi)t} = \int_0^{\infty}dt\, e^{-t}/(1 - e^{\xi t})$. Note that the integral blows up 
logarithmically near the lower limit, as expected.] 

I.9.2 Show that with the regularization used in the appendix, the $1/d$ expansion of the force between two 
conducting plates contains only even powers. 

I.9.3 Show off your skill in doing integrals by calculating the Casimir force in (3 + 1)-dimensional spacetime. 
For help, see M. Kardar and R. Golestanian, Rev. Mod. Phys. 71: 1233, 1999; J. Feinberg, A. Mann, and 
M. Revzen, Ann. Phys. 288: 103, 2001. 




Symmetry 


Symmetry, transformation, and invariance 


The importance of symmetry in modern physics cannot be overstated.¹ 

When a law of physics does not change upon some transformation, that law is said to 
exhibit a symmetry. 

I have already used Lorentz invariance to restrict the form of an action. Lorentz invariance 
is of course a symmetry of spacetime, but fields can also transform in what is thought 
of as an internal space. Indeed, we have already seen a simple example of this. I noted in 
passing in chapter I.3 that we could require the action of a scalar field theory to be invariant 
under the transformation $\varphi \to -\varphi$ and so exclude terms of odd power in $\varphi$, such as $\varphi^3$, 
from the action. 

With the $\varphi^3$ term included, two mesons could scatter and go into three mesons, for 
example by the diagrams in fig. I.10.1. But with this term excluded, you can easily 
convince yourself that this process is no longer allowed. You will not be able to draw 
a Feynman diagram with an odd number of external lines. (Think about modifying the 
integral in our baby problem in chapter I.7 to $\int_{-\infty}^{+\infty}dq\, e^{-\frac{1}{2}m^2q^2 - gq^3 - \lambda q^4 + Jq}$.) Thus the 
simple reflection symmetry $\varphi \to -\varphi$ implies that in any scattering process the number 
of mesons is conserved modulo 2. 
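
The baby integral mentioned in the parenthesis can be probed numerically. The Python sketch below is only an illustration under assumed parameter values ($m = 1$, $\lambda = 0.1$, a cubic coupling $g$ of either 0 or 0.3, and a simple trapezoid rule on a finite interval): it evaluates moments of the weight with $J = 0$ and shows that the odd moments, the analogs of amplitudes with an odd number of external lines, vanish when the cubic term is absent and not otherwise.

```python
import math

def moment(power, m=1.0, g=0.0, lam=0.1, half_width=8.0, steps=4000):
    """Trapezoid estimate of the integral of q**power * exp(-m^2 q^2/2 - g q^3 - lam q^4)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        q = -half_width + i * h
        weight = math.exp(-0.5 * m * m * q * q - g * q ** 3 - lam * q ** 4)
        edge = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += edge * q ** power * weight
    return total * h

for g in (0.0, 0.3):
    print("g =", g, "odd moments:", moment(1, g=g), moment(3, g=g))
```

With $g = 0$ the weight is even in $q$, so every odd moment cancels between $q$ and $-q$; switching on $g$ spoils the cancellation.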

Now that we understand one scalar field, let us consider a theory with two scalar fields 
$\varphi_1$ and $\varphi_2$ satisfying the reflection symmetry $\varphi_a \to -\varphi_a$ ($a$ = 1 or 2): 

$$\mathcal{L} = \frac{1}{2}(\partial\varphi_1)^2 - \frac{1}{2}m_1^2\varphi_1^2 - \frac{\lambda_1}{4}\varphi_1^4 + \frac{1}{2}(\partial\varphi_2)^2 - \frac{1}{2}m_2^2\varphi_2^2 - \frac{\lambda_2}{4}\varphi_2^4 - \frac{\rho}{2}\varphi_1^2\varphi_2^2 \tag{1}$$

We have two scalar particles, call them 1 and 2, with mass $m_1$ and $m_2$. To lowest order, 
they scatter in the processes $1 + 1 \to 1 + 1$, $2 + 2 \to 2 + 2$, $1 + 2 \to 1 + 2$, $1 + 1 \to 2 + 2$, 
and $2 + 2 \to 1 + 1$ (convince yourself). With the five parameters $m_1$, $m_2$, $\lambda_1$, $\lambda_2$, and $\rho$ 
completely arbitrary, there is no relationship between the two particles. 


1 A. Zee, Fearful Symmetry. 
It is almost an article of faith among theoretical physicists, enunciated forcefully by 
Einstein among others, that the fundamental laws should be orderly and simple, rather 
than arbitrary and complicated. This orderliness is reflected in the symmetry of the action. 

Suppose that $m_1 = m_2$ and $\lambda_1 = \lambda_2$; then the two particles would have the same mass and 
their interaction, with themselves and with each other, would be the same. The Lagrangian 
$\mathcal{L}$ becomes invariant under the interchange symmetry $\varphi_1 \leftrightarrow \varphi_2$. 

Next, suppose we further impose the condition $\rho = \lambda_1 = \lambda_2$ so that the Lagrangian 
becomes 

$$\mathcal{L} = \frac{1}{2}\left[(\partial\varphi_1)^2 + (\partial\varphi_2)^2\right] - \frac{1}{2}m^2(\varphi_1^2 + \varphi_2^2) - \frac{\lambda}{4}(\varphi_1^2 + \varphi_2^2)^2 \tag{2}$$

It is now invariant under the 2-dimensional rotation $\{\varphi_1(x) \to \cos\theta\,\varphi_1(x) + \sin\theta\,\varphi_2(x)$, 
$\varphi_2(x) \to -\sin\theta\,\varphi_1(x) + \cos\theta\,\varphi_2(x)\}$ with $\theta$ an arbitrary angle. We say that the theory 
enjoys an “internal” SO(2) symmetry, internal in the sense that the transformation has 
nothing to do with spacetime. In contrast to the interchange symmetry $\varphi_1 \leftrightarrow \varphi_2$ the 
transformation depends on the continuous parameter $\theta$, and the corresponding symmetry 
is said to be continuous. 
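
A quick numerical check of this statement, restricted to the non-derivative part of (1), can be sketched in Python. The sample field values, angles, and couplings below are arbitrary choices made for illustration, and the helper names are hypothetical. The check rotates $(\varphi_1, \varphi_2)$ and compares the potential before and after.

```python
import math

def potential(phi1, phi2, m1, m2, lam1, lam2, rho):
    """Minus the non-derivative terms of the Lagrangian (1)."""
    return (0.5 * m1 ** 2 * phi1 ** 2 + 0.25 * lam1 * phi1 ** 4
            + 0.5 * m2 ** 2 * phi2 ** 2 + 0.25 * lam2 * phi2 ** 4
            + 0.5 * rho * phi1 ** 2 * phi2 ** 2)

def rotate(phi1, phi2, theta):
    """The 2-dimensional rotation quoted in the text."""
    c, s = math.cos(theta), math.sin(theta)
    return c * phi1 + s * phi2, -s * phi1 + c * phi2

def max_change_under_rotation(params, samples=50):
    """Largest change of the potential over a set of sample points and angles."""
    worst = 0.0
    for k in range(samples):
        phi1, phi2 = 0.3 + 0.05 * k, -1.1 + 0.07 * k
        theta = 0.1 + 0.12 * k
        before = potential(phi1, phi2, *params)
        after = potential(*rotate(phi1, phi2, theta), *params)
        worst = max(worst, abs(after - before))
    return worst

# (m1, m2, lam1, lam2, rho): the second set keeps the interchange symmetry only.
print(max_change_under_rotation((1.0, 1.0, 0.5, 0.5, 0.5)))
print(max_change_under_rotation((1.0, 1.0, 0.5, 0.5, 0.2)))
```

The first parameter set satisfies $\rho = \lambda_1 = \lambda_2$ and the change is at rounding level; the second respects only $\varphi_1 \leftrightarrow \varphi_2$ and the potential visibly changes under a generic rotation, which illustrates the hierarchy of symmetries.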

We see from this simple example that symmetries exist in hierarchies. 


Continuous symmetries 

If we stare at the equations of motion $(\partial^2 + m^2)\varphi_a = -\lambda\vec\varphi^{\,2}\varphi_a$ long enough we see that if 
we define $J^\mu \equiv i(\varphi_1\partial^\mu\varphi_2 - \varphi_2\partial^\mu\varphi_1)$, then $\partial_\mu J^\mu = i(\varphi_1\partial^2\varphi_2 - \varphi_2\partial^2\varphi_1) = 0$ so that $J^\mu$ is a 
conserved current. The corresponding charge $Q = \int d^Dx\, J^0$, just like electric charge, is 
conserved. 

Historically, when Heisenberg noticed that the mass of the newly discovered neutron 
was almost the same as the mass of a proton, he proposed that if electromagnetism were 
somehow turned off there would be an internal symmetry transforming a proton into a 
neutron. 

An internal symmetry restricts the form of the theory, just as Lorentz invariance restricts 
the form of the theory. Generalizing our simple example, we could construct a field theory 
containing $N$ scalar fields $\varphi_a$, with $a = 1, \cdots, N$ such that the theory is invariant under the 
transformations $\varphi_a \to R_{ab}\varphi_b$ (repeated indices summed), where the matrix $R$ is an element 
of the rotation group SO(N) (see appendix B for a review of group theory). The fields $\varphi_a$ 
transform as a vector $\vec\varphi = (\varphi_1, \cdots, \varphi_N)$. We can form only one basic invariant, namely the 
scalar product $\vec\varphi\cdot\vec\varphi = \varphi_a\varphi_a = \vec\varphi^{\,2}$ (as always, repeated indices are summed unless otherwise 
specified). The Lagrangian is thus restricted to have the form 

$$\mathcal{L} = \frac{1}{2}\left[(\partial\vec\varphi)^2 - m^2\vec\varphi^{\,2}\right] - \frac{\lambda}{4}(\vec\varphi^{\,2})^2 \tag{3}$$

[Figure I.10.2: four-point vertex with internal indices $a$, $b$, $c$, $d$ on the external lines, carrying the factor $-2i\lambda(\delta_{ab}\delta_{cd} + \delta_{ac}\delta_{bd} + \delta_{ad}\delta_{bc})$.] 

The Feynman rules are given in fig. I.10.2. When we draw Feynman diagrams, each line 
carries an internal index in addition to momentum. 

Symmetry manifests itself in physical amplitudes. For example, imagine calculating 
the propagator $iD_{ab}(x) = \int D\vec\varphi\, e^{iS}\varphi_a(x)\varphi_b(0)$. We assume that the measure $D\vec\varphi$ is 
invariant under SO(N). By thinking about how $D_{ab}(x)$ transforms under the symmetry 
group SO(N) you see easily (exercise I.10.2) that it must be proportional to $\delta_{ab}$. You can 
check this by drawing a few Feynman diagrams or by considering an ordinary integral 
$\int d\vec q\, e^{-S(q)}q_aq_b$. No matter how complicated a diagram you draw (fig. I.10.3, e.g.) you 
always get this factor of $\delta_{ab}$. Similarly, scattering amplitudes must also exhibit the symmetry. 

Without the SO(N) symmetry, many other terms would be possible (e.g., $\varphi_a\varphi_b\varphi_c\varphi_d$ for 
some arbitrary choice of $a$, $b$, $c$, and $d$) in (3). 
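
The ordinary-integral version of this statement is easy to try out for $N = 2$. The Python sketch below is an illustration under assumptions of my own choosing: a midpoint rule on a square grid, an invariant "action" $S(q) = \frac{1}{2}m^2q^2 + \frac{\lambda}{4}(q^2)^2$ with sample couplings, and a hypothetical symmetry-breaking variant for contrast. It estimates $M_{ab} = \int d^2q\, e^{-S(q)}q_aq_b$.

```python
import math

def second_moments(action, half_width=6.0, steps=240):
    """Midpoint-rule estimate of M[a][b] = integral d^2q exp(-action(q1, q2)) q_a q_b."""
    h = 2.0 * half_width / steps
    pts = [-half_width + (i + 0.5) * h for i in range(steps)]
    M = [[0.0, 0.0], [0.0, 0.0]]
    for q1 in pts:
        for q2 in pts:
            w = math.exp(-action(q1, q2)) * h * h
            q = (q1, q2)
            for a in range(2):
                for b in range(2):
                    M[a][b] += w * q[a] * q[b]
    return M

def invariant_action(q1, q2, m=1.0, lam=0.4):
    """Depends on q only through the scalar product q.q, as in (3)."""
    r2 = q1 * q1 + q2 * q2
    return 0.5 * m * m * r2 + 0.25 * lam * r2 * r2

def broken_action(q1, q2):
    """Hypothetical extra terms that single out directions in the internal space."""
    return invariant_action(q1, q2) + 0.3 * q1 * q2 + 0.2 * q1 * q1

print(second_moments(invariant_action))
print(second_moments(broken_action))
```

For the invariant action the matrix comes out proportional to the identity; the extra terms generate an off-diagonal entry and split the diagonal ones. A square grid only tests the reflections and the interchange of the two components, not arbitrary rotations, so this is a consistency check rather than a proof.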

We can write $R = e^{\theta\cdot T}$ where $\theta\cdot T = \sum_A\theta^AT^A$ is a real antisymmetric matrix. The group 
SO(N) has $N(N - 1)/2$ generators, which we denote by $T^A$. [Think of the familiar case of 
SO(3).] Under an infinitesimal transformation (repeated indices summed) $\varphi_a \to R_{ab}\varphi_b \simeq 
(1 + \theta^AT^A)_{ab}\varphi_b$, or in other words, we have the infinitesimal change $\delta\varphi_a = \theta^AT^A_{ab}\varphi_b$. 
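
These statements can be made concrete with a small Python sketch. The basis chosen here (one matrix per index pair $i < j$, with entries $+1$ and $-1$) is one convenient assumption, not the only possible normalization, and the matrix exponential is a plain truncated power series written out for self-containment.

```python
def generators(n):
    """Real antisymmetric n x n matrices, one per pair i < j: a basis of generators of SO(n)."""
    gens = []
    for i in range(n):
        for j in range(i + 1, n):
            t = [[0.0] * n for _ in range(n)]
            t[i][j], t[j][i] = 1.0, -1.0
            gens.append(t)
    return gens

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(a, terms=40):
    """Matrix exponential by its power series (adequate for small matrices of modest norm)."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def rotation(thetas, n):
    """R = exp(theta . T) for the basis returned by generators(n)."""
    gens = generators(n)
    a = [[sum(th * g[i][j] for th, g in zip(thetas, gens)) for j in range(n)] for i in range(n)]
    return expm(a)

def orthogonality_defect(r):
    """Largest entry of R R^T - 1 in absolute value."""
    n = len(r)
    rt = [[r[j][i] for j in range(n)] for i in range(n)]
    prod = matmul(r, rt)
    return max(abs(prod[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n))

print(len(generators(3)), len(generators(5)))
print(orthogonality_defect(rotation([0.3, -0.7, 1.1], 3)))
print(rotation([0.5], 2))
```

For $N = 2$ the single generator reproduces the rotation $\varphi_1 \to \cos\theta\,\varphi_1 + \sin\theta\,\varphi_2$ used earlier, and for small $\theta$ the exponential reduces to $1 + \theta^AT^A$.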


Noether’s theorem 

We now come to one of the most profound observations in theoretical physics, namely 
Noether’s theorem, which states that a conserved current is associated with each generator 
of a continuous symmetry. The appearance of a conserved current for (2) is not an accident. 

[Figure I.10.3] 

As is often the case with the most important theorems, the proof of Noether’s theorem is 
astonishingly simple. Denote the fields in our theory generically by $\varphi_a$. Since the symmetry 
is continuous, we can consider an infinitesimal change $\delta\varphi_a$. Since $\mathcal{L}$ does not change, 
we have 

$$0 = \delta\mathcal{L} = \frac{\delta\mathcal{L}}{\delta\varphi_a}\delta\varphi_a + \frac{\delta\mathcal{L}}{\delta\partial_\mu\varphi_a}\delta\partial_\mu\varphi_a = \frac{\delta\mathcal{L}}{\delta\varphi_a}\delta\varphi_a + \frac{\delta\mathcal{L}}{\delta\partial_\mu\varphi_a}\partial_\mu\delta\varphi_a \tag{4}$$


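We would have been stuck at this point, but if we use the equations of motion $\delta\mathcal{L}/\delta\varphi_a = 
\partial_\mu(\delta\mathcal{L}/\delta\partial_\mu\varphi_a)$ we can combine the two terms and obtain 

$$0 = \partial_\mu\left(\frac{\delta\mathcal{L}}{\delta\partial_\mu\varphi_a}\delta\varphi_a\right) \tag{5}$$

If we define 

$$J^\mu \equiv \frac{\delta\mathcal{L}}{\delta\partial_\mu\varphi_a}\delta\varphi_a \tag{6}$$

then (5) says that $\partial_\mu J^\mu = 0$. We have found a conserved current! [It is clear from the 
derivation that the repeated index $a$ is summed in (6).] 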

Let us immediately illustrate with the simple scalar field theory in (3). Plugging $\delta\varphi_a = 
\theta^A(T^A)_{ab}\varphi_b$ into (6) and noting that $\theta^A$ is arbitrary, we obtain $N(N - 1)/2$ conserved 
currents $J^A_\mu = \partial_\mu\varphi_a(T^A)_{ab}\varphi_b$, one for each generator of the symmetry group SO(N). 

In the special case $N = 2$, we can define a complex field $\varphi \equiv (\varphi_1 + i\varphi_2)/\sqrt{2}$. The 
Lagrangian in (3) can be written as 

$$\mathcal{L} = \partial\varphi^\dagger\partial\varphi - m^2\varphi^\dagger\varphi - \lambda(\varphi^\dagger\varphi)^2,$$

and is clearly invariant under $\varphi \to e^{i\theta}\varphi$ and $\varphi^\dagger \to e^{-i\theta}\varphi^\dagger$. We find from (6) that $J_\mu = 
i(\varphi^\dagger\partial_\mu\varphi - \partial_\mu\varphi^\dagger\varphi)$, the current we met already in chapter I.8. Mathematically, this is 
because the groups SO(2) and U(1) are isomorphic (see appendix B). 
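
To watch a Noether charge stay constant, one can drop the spatial dependence altogether, so that the $N = 2$ theory becomes two coupled anharmonic oscillators $q_1(t)$, $q_2(t)$ obeying $\ddot q_a = -m^2q_a - \lambda(q_1^2 + q_2^2)q_a$. The Python sketch below rests on those simplifying assumptions plus a standard fourth-order Runge-Kutta integrator with sample parameters; the charge it monitors is the analog of $\int d^Dx\, J^0$ with the conventional factor of $i$ dropped.

```python
def accelerations(q1, q2, m=1.0, lam=0.5):
    """Equations of motion of (2) reduced to 0+1 dimensions."""
    r2 = q1 * q1 + q2 * q2
    f = -(m * m + lam * r2)
    return f * q1, f * q2

def rk4_step(state, dt):
    """One classical Runge-Kutta step for state = (q1, q2, v1, v2)."""
    def deriv(s):
        q1, q2, v1, v2 = s
        a1, a2 = accelerations(q1, q2)
        return (v1, v2, a1, a2)

    def shifted(s, k, scale):
        return tuple(x + scale * dx for x, dx in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * dt))
    k3 = deriv(shifted(state, k2, 0.5 * dt))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(x + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def charge(state):
    """q1 * dq2/dt - q2 * dq1/dt, the 0+1 dimensional analog of the SO(2) charge."""
    q1, q2, v1, v2 = state
    return q1 * v2 - q2 * v1

def evolve(state, dt=1e-3, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

initial = (1.0, 0.0, 0.2, 0.7)
final = evolve(initial)
print(charge(initial), charge(final))
```

The positions and velocities change substantially over the run while the charge stays fixed up to integration error. Replacing the force by one that is not rotationally invariant would make the charge drift.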

For pedagogical clarity I have used the example of scalar fields transforming as a vector 
under the group SO(N). Obviously, the preceding discussion holds for an arbitrary group 
$G$ with the fields $\varphi$ transforming under an arbitrary representation $\mathcal{R}$ of $G$. The conserved 
currents are still given by $J^A_\mu = \partial_\mu\varphi_a(T^A)_{ab}\varphi_b$ with $T^A$ the generators evaluated in the 
representation $\mathcal{R}$. For example, if $\varphi$ transforms as the 5-dimensional representation of 
SO(3) then $T^A$ is a 5 by 5 matrix. 

For physics to be invariant under a group of transformations it is only necessary that the 
action be invariant. The Lagrangian density $\mathcal{L}$ could very well change by a total divergence: 
$\delta\mathcal{L} = \partial_\mu K^\mu$, provided that the relevant boundary term could be dropped. Then we would 
see immediately from (5) that all we have to do to obtain a formula for the conserved 
current is to modify (6) to $J^\mu = (\delta\mathcal{L}/\delta\partial_\mu\varphi_a)\delta\varphi_a - K^\mu$. As we will see in chapter VIII.4, 
many supersymmetric field theories are of this type. 
Charge as generators 

Using the canonical formalism of chapter I.8, we can derive an elegant result for the charge 
associated with the conserved current 

$$Q \equiv \int d^3x\, J^0 = \int d^3x\, \frac{\delta\mathcal{L}}{\delta\partial_0\varphi_a}\delta\varphi_a$$

Note that $Q$ does not depend on the time at which the integral is evaluated: 

$$\frac{dQ}{dt} = \int d^3x\, \partial_0J^0 = -\int d^3x\, \partial_iJ^i = 0 \tag{7}$$

Recognizing that $\delta\mathcal{L}/\delta\partial_0\varphi_a$ is just the canonical momentum conjugate to the field $\varphi_a$, we 
see that 

$$i[Q, \varphi_a] = \delta\varphi_a \tag{8}$$

The charge operator generates the corresponding transformation on the fields. An important 
special case is for the complex field $\varphi$ in the SO(2) $\simeq$ U(1) theory we discussed; then 
$[Q, \varphi] = \varphi$ and $e^{i\theta Q}\varphi e^{-i\theta Q} = e^{i\theta}\varphi$. 


Exercises 

I.10.1 Some authors prefer the following more elaborate formulation of Noether’s theorem. Suppose that the 
action does not change under an infinitesimal transformation $\delta\varphi_a(x) = \theta^AV^A_a$ [with $\theta^A$ some parameters 
labeled by $A$ and $V^A_a$ some function of the fields $\varphi_b(x)$ and possibly also of their first derivatives with 
respect to $x$]. It is important to emphasize that when we say the action $S$ does not change we are not 
allowed to use the equations of motion. After all, the Euler-Lagrange equations of motion follow from 
demanding that $\delta S = 0$ for any variation $\delta\varphi_a$ subject to certain boundary conditions. Our scalar field 
theory example nicely illustrates this point, which is confused in some books: $\delta S = 0$ merely because $S$ 
is constructed using the scalar product of O(N) vectors. 

Now let us do something apparently a bit strange. Let us consider the infinitesimal change written 
above but with the parameters $\theta^A$ dependent on $x$. In other words, we now consider $\delta\varphi_a(x) = \theta^A(x)V^A_a$. 
Then of course there is no reason for $\delta S$ to vanish; but, on the other hand, we know that since $\delta S$ does 
vanish when $\theta^A$ is constant, $\delta S$ must have the form $\delta S = \int d^4x\, J^\mu(x)\partial_\mu\theta^A(x)$. In practice, this gives us 
a quick way of reading off the current $J^\mu(x)$: it is just the coefficient of $\partial_\mu\theta^A(x)$ in $\delta S$. 

Show how all this works for the Lagrangian in (3). 

I.10.2 Show that $D_{ab}(x)$ must be proportional to $\delta_{ab}$ as stated in the text. 

I.10.3 Write the Lagrangian for an SO(3) invariant theory containing a Lorentz scalar field $\varphi$ transforming 
in the 5-dimensional representation up to quartic terms. [Hint: It is convenient to write $\varphi$ as a 3 by 3 
symmetric traceless matrix.] 

I.10.4 Add a Lorentz scalar field $\eta$ transforming as a vector under SO(3) to the Lagrangian in exercise I.10.3, 
maintaining SO(3) invariance. Determine the Noether currents in this theory. Using the equations of 
motion, check that the currents are conserved. 



Field Theory in Curved Spacetime 




General coordinate transformation 


In Einstein’s theory of gravity, the invariant Minkowskian spacetime interval $ds^2 = 
\eta_{\mu\nu}dx^\mu dx^\nu = (dt)^2 - (d\vec x)^2$ is replaced by $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$, where the metric tensor 
$g_{\mu\nu}(x)$ is a function of the spacetime coordinates $x$. The guiding principle, known as the 
principle of general covariance, states that physics, as embodied in the action $S$, must be 
invariant under arbitrary coordinate transformations $x \to x'(x)$. More precisely, the 
principle¹ states that with suitable restrictions the effect of a gravitational field is equivalent to 
that of a coordinate transformation. 

Since 

$$ds^2 = g'_{\lambda\sigma}dx'^\lambda dx'^\sigma = g'_{\lambda\sigma}\frac{\partial x'^\lambda}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}dx^\mu dx^\nu = g_{\mu\nu}dx^\mu dx^\nu$$

the metric transforms as 

$$g'_{\lambda\sigma}(x')\frac{\partial x'^\lambda}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu} = g_{\mu\nu}(x) \tag{1}$$


The inverse of the metric $g^{\mu\nu}$ is defined by $g^{\mu\nu}g_{\nu\rho} = \delta^\mu_\rho$. 

A scalar field by its very name does not transform: $\varphi(x) = \varphi'(x')$. The gradient of the 
scalar field transforms as 

$$\partial_\mu\varphi(x) = \frac{\partial x'^\lambda}{\partial x^\mu}\frac{\partial\varphi'(x')}{\partial x'^\lambda} = \frac{\partial x'^\lambda}{\partial x^\mu}\partial'_\lambda\varphi'(x')$$

By definition, a (covariant) vector field transforms as 

$$A_\mu(x) = \frac{\partial x'^\lambda}{\partial x^\mu}A'_\lambda(x')$$

so that $\partial_\mu\varphi(x)$ is a vector field. Given two vector fields $A_\mu(x)$ and $B_\nu(x)$, we can contract 
them with $g^{\mu\nu}(x)$ to form $g^{\mu\nu}(x)A_\mu(x)B_\nu(x)$, which, as you can immediately check, is 
a scalar. In particular, $g^{\mu\nu}(x)\partial_\mu\varphi(x)\partial_\nu\varphi(x)$ is a scalar. Thus, if we simply replace the 
Minkowski metric $\eta^{\mu\nu}$ in the Lagrangian $\mathcal{L} = \frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] = \frac{1}{2}(\eta^{\mu\nu}\partial_\mu\varphi\partial_\nu\varphi - m^2\varphi^2)$ by 
the Einstein metric $g^{\mu\nu}$, the Lagrangian is invariant under coordinate transformation. 

1 For a precise statement of the principle of general covariance, see S. Weinberg, Gravitation and Cosmology, 

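The action is obtained by integrating the Lagrangian over spacetime. Under a coordinate 
transformation $d^4x' = d^4x\,\det(\partial x'/\partial x)$. Taking the determinant of (1), we have 

$$g \equiv \det g_{\mu\nu} = \det\left[g'_{\lambda\sigma}\frac{\partial x'^\lambda}{\partial x^\mu}\frac{\partial x'^\sigma}{\partial x^\nu}\right] = g'\left[\det\left(\frac{\partial x'}{\partial x}\right)\right]^2 \tag{2}$$

We see that the combination $d^4x\sqrt{-g} = d^4x'\sqrt{-g'}$ is invariant under coordinate 
transformation. 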
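
The transformation law (1) and the determinant relation (2) can be tried out on the simplest nontrivial case. The Python sketch below uses an assumed toy setting rather than spacetime: the flat two-dimensional Euclidean plane, with $x = (x, y)$ and $x' = (r, \theta)$, so the invariant measure involves $\sqrt{g}$ instead of $\sqrt{-g}$. It builds the metric in polar coordinates from the Jacobian and checks that $\sqrt{\det g'}$ equals the familiar factor $r$ in $r\,dr\,d\theta$.

```python
import math

def jacobian(r, theta):
    """Matrix of d(x, y)/d(r, theta) for x = r cos(theta), y = r sin(theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transformed_metric(g, jac):
    """g'_{ls} = (dx^m/dx'^l) (dx^n/dx'^s) g_{mn}, i.e. eq. (1) solved for g'."""
    n = len(g)
    return [[sum(jac[m][l] * g[m][k] * jac[k][s] for m in range(n) for k in range(n))
             for s in range(n)] for l in range(n)]

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

flat = [[1.0, 0.0], [0.0, 1.0]]  # Cartesian metric of the plane
r, theta = 2.0, 0.7
jac = jacobian(r, theta)
g_polar = transformed_metric(flat, jac)

print(g_polar)                       # close to diag(1, r**2)
print(math.sqrt(det2(g_polar)), abs(det2(jac)) * math.sqrt(det2(flat)))
```

The two numbers on the last line agree, which is the content of (2): the square root of the determinant of the metric picks up exactly the Jacobian factor that the coordinate volume element loses.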

Thus, given a quantum field theory we can immediately write down the theory in curved 
spacetime. All we have to do is promote the Minkowski metric $\eta^{\mu\nu}$ in our Lagrangian to the 
Einstein metric $g^{\mu\nu}$ and include a factor $\sqrt{-g}$ in the spacetime integration measure.² The 
action $S$ would then be invariant under arbitrary coordinate transformations. For example, 
the action for a scalar field in curved spacetime is simply 

$$S = \int d^4x\sqrt{-g}\,\frac{1}{2}\left(g^{\mu\nu}\partial_\mu\varphi\partial_\nu\varphi - m^2\varphi^2\right) \tag{3}$$

(There is a slight subtlety involving the spin $\frac{1}{2}$ field that we will talk about in part II. We 
will eventually come to it in chapter VIII.1.) 

There is no essential difficulty in quantizing the scalar field in curved spacetime. We 
simply treat $g_{\mu\nu}$ as given [e.g., the Schwarzschild metric in spherical coordinates: $g_{00} = 
(1 - 2GM/r)$, $g_{rr} = -(1 - 2GM/r)^{-1}$, $g_{\theta\theta} = -r^2$, and $g_{\phi\phi} = -r^2\sin^2\theta$] and study the 
path integral $\int D\varphi\, e^{iS}$, which is still a Gaussian integral and thus do-able. The propagator 
of the scalar field $D(x, y)$ can be worked out and so on and so forth (see exercise I.11.1). 

At this point, aside from the fact that $g_{\mu\nu}(x)$ carries Lorentz indices while $\varphi(x)$ does not, 
the metric $g_{\mu\nu}$ looks just like a field and is in fact a classical field. Write the action of the 
world $S = S_g + S_M$ as the sum of two terms: $S_g$ describing the dynamics of the gravitational 
field $g_{\mu\nu}$ (which we will come to in chapter VIII.1) and $S_M$ describing the dynamics of all 
the other fields in the world [the “matter fields,” namely $\varphi$ in our simple example with $S_M$ 
as given in (3)]. We could quantize gravity by integrating over $g_{\mu\nu}$ as well, thus extending 
the path integral to $\int Dg\,D\varphi\, e^{iS}$. 

Easier said than done! As you have surely heard, all attempts to integrate over $g_{\mu\nu}(x)$ 
have been beset with difficulties, eventually driving theorists to seek solace in string theory. 
I will explain in due time why Einstein’s theory is known as “nonrenormalizable.” 


2 We also have to replace ordinary derivatives $\partial_\mu$ by the covariant derivatives $D_\mu$ of general relativity, but acting 
on a scalar field $\varphi$ the covariant derivative is just the ordinary derivative $D_\mu\varphi = \partial_\mu\varphi$. 
What the graviton listens to 


One of the most profound results to come out of Einstein’s theory of gravity is a 
fundamental definition of energy and momentum. What exactly are energy and momentum 
anyway? Energy and momentum are what the graviton listens to. (The graviton is of course 
the particle associated with the field $g_{\mu\nu}$.) 

The stress-energy tensor $T^{\mu\nu}$ is defined as the variation of the matter action with 
respect to the metric $g_{\mu\nu}$ (holding the coordinates $x^\mu$ fixed): 

$$T^{\mu\nu}(x) = -\frac{2}{\sqrt{-g}}\frac{\delta S_M}{\delta g_{\mu\nu}(x)} \tag{4}$$

Energy is defined as $E = P^0 = \int d^3x\sqrt{-g}\,T^{00}(x)$ and momentum as $P^i = \int d^3x\sqrt{-g}\,T^{0i}(x)$. 

Even if we are not interested in curved spacetime per se, (4) still offers us a simple (and 
fundamental) way to determine the stress energy of a field theory in flat spacetime. We 
simply vary around the Minkowski metric $\eta_{\mu\nu}$ by writing $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ and expand $S_M$ 
to first order in $h$. According to (4), we have³ 

$$S_M(h) = S_M(h = 0) - \int d^4x\left[\frac{1}{2}h_{\mu\nu}T^{\mu\nu} + O(h^2)\right] \tag{5}$$

The symmetric tensor field $h_{\mu\nu}(x)$ is in fact the graviton field (see chapters I.5 and 
VIII.1). The stress-energy tensor $T^{\mu\nu}(x)$ is what the graviton field couples to, just as the 
electromagnetic current $J^\mu(x)$ is what the photon field couples to. 

Consider a general $S_M = \int d^4x\sqrt{-g}\,(A + g^{\mu\nu}B_{\mu\nu} + g^{\mu\nu}g^{\lambda\rho}C_{\mu\nu\lambda\rho} + \cdots)$. Since $-g = 
1 + \eta^{\mu\nu}h_{\mu\nu} + O(h^2)$ and $g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + O(h^2)$, we find 

$$T_{\mu\nu} = 2(B_{\mu\nu} + 2C_{\mu\nu\lambda\rho}\eta^{\lambda\rho} + \cdots) - \eta_{\mu\nu}\mathcal{L} \tag{6}$$

in flat spacetime. Note 

$$T \equiv \eta^{\mu\nu}T_{\mu\nu} = -(4A + 2\eta^{\mu\nu}B_{\mu\nu} + 0\cdot\eta^{\mu\nu}\eta^{\lambda\rho}C_{\mu\nu\lambda\rho} + \cdots) \tag{7}$$

which we have written in a form emphasizing that $C_{\mu\nu\lambda\rho}$ does not contribute to the trace 
of the stress-energy tensor. 

We now show the power of this definition of $T^{\mu\nu}$ by obtaining long-familiar results 
about the electromagnetic field. Promoting the Lagrangian of the massive spin 1 field 
to curved spacetime we have⁴ $\mathcal{L} = (-\frac{1}{4}g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho} + \frac{1}{2}m^2g^{\mu\nu}A_\mu A_\nu)$ and thus⁵ $T_{\mu\nu} = 
-F_{\mu\lambda}F_\nu{}^\lambda + m^2A_\mu A_\nu - \eta_{\mu\nu}\mathcal{L}$. 


3 I use the normal convention in which indices are summed regardless of any symmetry; in other words, 
$\frac{1}{2}h_{\mu\nu}T^{\mu\nu} = \frac{1}{2}(h_{01}T^{01} + h_{10}T^{10} + \cdots) = h_{01}T^{01} + \cdots$. 

4 Here we use the fact that the covariant curl is equal to the ordinary curl $D_\mu A_\nu - D_\nu A_\mu = \partial_\mu A_\nu - \partial_\nu A_\mu$ and 
so $F_{\mu\nu}$ does not involve the metric. 

5 Holding $x^\mu$ fixed means that we hold $\partial_\mu$ and hence $A_\mu$ fixed since $A_\mu$ is related to $\partial_\mu$ by gauge invariance. 
We are anticipating (see chapter II.7) here, but you have surely heard of gauge invariance in a nonrelativistic 
context. 
For the electromagnetic field we set $m = 0$. First, $\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = -\frac{1}{4}(-2F_{0i}^2 + F_{ij}^2) = 
\frac{1}{2}(\vec E^2 - \vec B^2)$. Thus, $T_{00} = -F_{0\lambda}F_0{}^\lambda - \frac{1}{2}(\vec E^2 - \vec B^2) = \frac{1}{2}(\vec E^2 + \vec B^2)$. That was comforting, to 
see a result we knew from “childhood.” Incidentally, this also makes clear that we can 
think of $\vec E^2$ as kinetic energy and $\vec B^2$ as potential energy. Next, $T_{0i} = -F_{0\lambda}F_i{}^\lambda = F_{0j}F_{ij} = 
\varepsilon_{ijk}E_jB_k = (\vec E\times\vec B)_i$. The Poynting vector has just emerged. 

Since the Maxwell Lagrangian $\mathcal{L} = -\frac{1}{4}g^{\mu\nu}g^{\lambda\rho}F_{\mu\lambda}F_{\nu\rho}$ involves only the $C$ term with 
$C_{\mu\nu\lambda\rho} = -\frac{1}{4}F_{\mu\lambda}F_{\nu\rho}$, we see from (7) that the stress-energy tensor of the electromagnetic 
field is traceless, an important fact we will need in chapter VIII.1. We can of course check 
directly that $T = 0$ (exercise I.11.4).⁶ 
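
These three results (energy density, Poynting vector, tracelessness) can be checked numerically for arbitrary field values. The Python sketch below assumes a particular sign convention that the text does not spell out, namely contravariant components $F^{0i} = -E_i$ and $F^{ij} = -\varepsilon_{ijk}B_k$ with metric signature $(+,-,-,-)$, and it works with upper indices throughout, evaluating $T^{\mu\nu} = -F^{\mu\lambda}F^{\nu}{}_{\lambda} - \eta^{\mu\nu}\mathcal{L}$. With other conventions individual signs of the mixed components can differ.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def field_strength(E, B):
    """Contravariant F^{mu nu} with F^{0i} = -E_i, F^{ij} = -eps_{ijk} B_k (assumed convention)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def lagrangian(F):
    """L = -(1/4) F_{mu nu} F^{mu nu}; indices are lowered with the diagonal ETA."""
    return -0.25 * sum(ETA[m][m] * ETA[n][n] * F[m][n] ** 2
                       for m in range(4) for n in range(4))

def stress_tensor(F):
    """T^{mu nu} = -F^{mu a} F^{nu}_a - eta^{mu nu} L for the massless field."""
    L = lagrangian(F)
    return [[-sum(F[m][a] * ETA[a][a] * F[n][a] for a in range(4)) - ETA[m][n] * L
             for n in range(4)] for m in range(4)]

def cross(E, B):
    return (E[1] * B[2] - E[2] * B[1],
            E[2] * B[0] - E[0] * B[2],
            E[0] * B[1] - E[1] * B[0])

E = (1.0, 2.0, 3.0)
B = (-0.5, 0.4, 2.0)
T = stress_tensor(field_strength(E, B))

energy_density = 0.5 * (sum(e * e for e in E) + sum(b * b for b in B))
trace = sum(ETA[m][m] * T[m][m] for m in range(4))
print(T[0][0], energy_density)
print([T[0][i] for i in (1, 2, 3)], cross(E, B))
print(trace)
```

The trace vanishes for any choice of $\vec E$ and $\vec B$, as (7) predicts for a Lagrangian built only from the $C$ term.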


Appendix: A concise introduction to curved spacetime 


General relativity is often made to seem more difficult and mysterious than need be. Here I give a concise review 
of some of its basic elements for later use. 

Denote the spacetime coordinates of a point particle by $X^\mu$. To construct its action note that the only 
coordinate invariant quantity is the “length”⁷ of the world line traced out by the particle (fig. I.11.1), namely 
$\int ds = \int\sqrt{g_{\mu\nu}dX^\mu dX^\nu}$, where $g_{\mu\nu}$ is evaluated at the position of the particle of course. Thus, the action for a 
point particle must be proportional to 

$$\int ds = \int\sqrt{g_{\mu\nu}dX^\mu dX^\nu} = \int\sqrt{g_{\mu\nu}[X(\zeta)]\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}}\,d\zeta$$

where $\zeta$ is any parameter that varies monotonically along the world line. The length, being geometric, is 
manifestly reparametrization invariant, that is, independent of our choice of $\zeta$ as long as it is reasonable. This 
is one of those “more obvious than obvious” facts since $\int\sqrt{g_{\mu\nu}dX^\mu dX^\nu}$ is manifestly independent of $\zeta$. If we 
insist, we can check the reparametrization invariance of $\int ds$. Obviously the powers of $d\zeta$ match. Explicitly, if 
we write $\zeta = \zeta(\eta)$, then $dX^\mu/d\zeta = (d\eta/d\zeta)(dX^\mu/d\eta)$ and $d\zeta = (d\zeta/d\eta)d\eta$. 

Let us define 

$$K \equiv g_{\mu\nu}[X(\zeta)]\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$$

for ease of writing. Setting the variation of $\int d\zeta\sqrt{K}$ equal to zero, we obtain 

$$\int d\zeta\,\frac{1}{\sqrt{K}}\left(2g_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{d\,\delta X^\nu}{d\zeta} + \partial_\lambda g_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}\delta X^\lambda\right) = 0$$

which upon integration by parts (and with $\delta X^\lambda = 0$ at the endpoints as usual) gives the equation of motion 

$$\sqrt{K}\frac{d}{d\zeta}\left(\frac{1}{\sqrt{K}}\,2g_{\mu\lambda}\frac{dX^\mu}{d\zeta}\right) - \partial_\lambda g_{\mu\nu}\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta} = 0 \tag{8}$$

To simplify (8) we exploit our freedom in choosing $\zeta$ and set $d\zeta = ds$, so that $K = 1$. We have 

$$2g_{\mu\lambda}\frac{d^2X^\mu}{ds^2} + 2\partial_\sigma g_{\mu\lambda}\frac{dX^\sigma}{ds}\frac{dX^\mu}{ds} - \partial_\lambda g_{\mu\nu}\frac{dX^\mu}{ds}\frac{dX^\nu}{ds} = 0$$


6 We see that tracelessness is related to the fact that the electromagnetic field has no mass scale. Pure 
electromagnetism is said to be scale or dilatation invariant. For more on dilatation invariance see S. Coleman, 
Aspects of Symmetry, p. 67. 

7 We put “length” in quotes because if $g_{\mu\nu}$ had a Euclidean signature then $\int ds$ would indeed be the length 
and minimizing $\int ds$ would give the shortest path (the geodesic) between the endpoints, but here $g_{\mu\nu}$ has a 
Minkowskian signature. 
which upon multiplication by $g^{\rho\lambda}$ (and division by 2) becomes 

$$\frac{d^2X^\rho}{ds^2} + \frac{1}{2}g^{\rho\lambda}(2\partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu})\frac{dX^\mu}{ds}\frac{dX^\nu}{ds} = 0$$

that is, 

$$\frac{d^2X^\rho}{ds^2} + \Gamma^\rho_{\mu\nu}[X(s)]\frac{dX^\mu}{ds}\frac{dX^\nu}{ds} = 0 \tag{9}$$

if we define the Riemann-Christoffel symbol by 

$$\Gamma^\rho_{\mu\nu} \equiv \frac{1}{2}g^{\rho\lambda}(\partial_\mu g_{\nu\lambda} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}) \tag{10}$$

Given the initial position $X^\mu(s_0)$ and velocity $(dX^\mu/ds)(s_0)$ we have four second order differential equations (9) 
determining the geodesic followed by the particle in curved spacetime. Note that, contrary to the impression 
given by some texts, unlike (8), (9) is not reparametrization invariant. 

To recover Newtonian gravity, three conditions must be met: (1) the particle moves slowly, $dX^i/ds \ll dX^0/ds$; 
(2) the gravitational field is weak, so that the metric is almost Minkowskian, $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$; and (3) the 
gravitational field does not depend on time. Condition (1) means that $d^2X^\rho/ds^2 + \Gamma^\rho_{00}(dX^0/ds)^2 \simeq 0$, while (2) 
and (3) imply that $\Gamma^\rho_{00} \simeq -\frac{1}{2}\eta^{\rho\lambda}\partial_\lambda h_{00}$. Thus, (9) reduces to $d^2X^0/ds^2 \simeq 0$ (which implies that $dX^0/ds$ is a constant) 
and $d^2X^i/ds^2 + \frac{1}{2}\partial_ih_{00}(dX^0/ds)^2 \simeq 0$, which since $X^0$ is proportional to $s$ becomes $d^2X^i/dt^2 = -\frac{1}{2}\partial_ih_{00}$. Thus, 
we obtain Newton’s equation $d^2\vec X/dt^2 = -\vec\nabla\phi$ if we define the gravitational potential $\phi$ by $h_{00} = 2\phi$: 

$$g_{00} = 1 + 2\phi \tag{11}$$

Referring to the Schwarzschild metric, we see that far from a massive body, $\phi = -GM/r$, as we expect. (Note 
also that this derivation depends neither on $h_{ij}$ nor on $h_{0j}$, as long as they are time independent.) 
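
The key step of the Newtonian limit, $\Gamma^i_{00} = \partial_i\phi$ for the metric (11), can be confirmed with finite differences. The Python sketch below is written under explicit assumptions: a diagonal static metric $g_{\mu\nu} = \mathrm{diag}(1 + 2\phi, -1, -1, -1)$ in Cartesian coordinates, units with $c = 1$, a hypothetical value of $GM$, and a Christoffel routine specialized to diagonal metrics so that the inverse metric is trivial.

```python
import math

GM = 0.01  # hypothetical value, in units with c = 1

def potential(x, y, z):
    """Newtonian potential phi = -GM/r of a point mass at the origin."""
    return -GM / math.sqrt(x * x + y * y + z * z)

def metric_diag(point):
    """Diagonal entries of g_{mu nu} = diag(1 + 2 phi, -1, -1, -1) at (t, x, y, z)."""
    _, x, y, z = point
    return [1.0 + 2.0 * potential(x, y, z), -1.0, -1.0, -1.0]

def d_metric(point, lam, mu, eps=1e-5):
    """Central-difference estimate of d_lam g_{mu mu}."""
    up, down = list(point), list(point)
    up[lam] += eps
    down[lam] -= eps
    return (metric_diag(up)[mu] - metric_diag(down)[mu]) / (2.0 * eps)

def christoffel(point, rho, mu, nu):
    """Gamma^rho_{mu nu} from eq. (10), specialized to a diagonal metric."""
    g = metric_diag(point)

    def dg(a, b, c):  # d_a g_{bc}; off-diagonal components vanish identically
        return d_metric(point, a, b) if b == c else 0.0

    return 0.5 / g[rho] * (dg(mu, nu, rho) + dg(nu, mu, rho) - dg(rho, mu, nu))

point = (0.0, 3.0, 4.0, 0.0)  # r = 5
gamma_x00 = christoffel(point, 1, 0, 0)
grad_phi_x = GM * point[1] / 5.0 ** 3  # d(phi)/dx = GM x / r^3

print(gamma_x00, grad_phi_x)
print("Newtonian acceleration along x:", -gamma_x00)
```

The coordinate acceleration of a slow particle is $-\Gamma^i_{00}$, which points toward the mass, as it should.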

Thus, the action of a point particle is 

$$S = -m\int\sqrt{g_{\mu\nu}dX^\mu dX^\nu} = -m\int\sqrt{g_{\mu\nu}[X(\zeta)]\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}}\,d\zeta \tag{12}$$

The $m$ follows from dimensional analysis. 

A slick way of deriving $S$ (which also allows us to see the minus sign) is to start with the nonrelativistic action 
of a particle in a gravitational potential $\phi$, namely $S = \int L\,dt = \int(\frac{1}{2}mv^2 - m - m\phi)\,dt$. Note that the rest mass 
$m$ comes in with a minus sign as it is part of the potential energy in nonrelativistic physics. Now force $S$ into a 
relativistic form: For small $v$ and $\phi$, 

$$S = -m\int\left(1 - \frac{1}{2}v^2 + \phi\right)dt \simeq -m\int\sqrt{1 - v^2 + 2\phi}\,dt = -m\int\sqrt{(1 + 2\phi)(dt)^2 - (d\vec x)^2}$$

We see⁸ that the 2 in (11) comes from the square root in the Lorentz-Fitzgerald quantity $\sqrt{1 - v^2}$. 
Now that we have $S$ we can calculate the stress energy of a point particle using (4): 

$$T^{\mu\nu}(x) = \frac{m}{\sqrt{-g}}\int d\zeta\, K^{-\frac{1}{2}}\,\delta^{(4)}[x - X(\zeta)]\frac{dX^\mu}{d\zeta}\frac{dX^\nu}{d\zeta}$$

Setting $\zeta$ to $s$ (which we call the proper time $\tau$ in this context) we have 

$$T^{\mu\nu}(x) = \frac{m}{\sqrt{-g}}\int d\tau\,\delta^{(4)}[x - X(\tau)]\frac{dX^\mu}{d\tau}\frac{dX^\nu}{d\tau}$$

In particular, as we expect the 4-momentum of the particle is given by 

$$P^\nu = \int d^3x\sqrt{-g}\,T^{0\nu} = m\int d\tau\,\delta[x^0 - X^0(\tau)]\frac{dX^0}{d\tau}\frac{dX^\nu}{d\tau} = m\frac{dX^\nu}{d\tau}$$

The action (12) given here has two defects: (1) it is difficult to deal with a path integral 
$\int DX\, e^{-im\int d\zeta\sqrt{(dX^\mu/d\zeta)(dX_\mu/d\zeta)}}$ involving a square root, and (2) $S$ does not make sense for a massless particle. 
To remedy these defects, note that classically, $S$ is equivalent to 

$$S_{\rm imp} = -\frac{1}{2}\int d\zeta\left(\frac{1}{\gamma}\frac{dX^\mu}{d\zeta}\frac{dX_\mu}{d\zeta} + \gamma m^2\right) \tag{13}$$

where $(dX^\mu/d\zeta)(dX_\mu/d\zeta) = g_{\mu\nu}(X)(dX^\mu/d\zeta)(dX^\nu/d\zeta)$. Varying with respect to $\gamma(\zeta)$ we obtain $m^2\gamma^2 = 
(dX^\mu/d\zeta)(dX_\mu/d\zeta)$. Eliminating $\gamma$ in $S_{\rm imp}$ we recover $S$. 

The path integral $\int DX\, e^{iS_{\rm imp}}$ has a standard quadratic form.⁹ Quantum mechanics of a relativistic point 
particle is best formulated in terms of $S_{\rm imp}$, not $S$. Furthermore, for $m = 0$, $S_{\rm imp} = -\frac{1}{2}\int d\zeta\,[\gamma^{-1}(dX^\mu/d\zeta)(dX_\mu/d\zeta)]$ 
makes perfect sense. Note that varying with respect to $\gamma$ now gives the well-known fact that for a massless particle 
$g_{\mu\nu}(X)dX^\mu dX^\nu = 0$. 

The action $S_{\rm imp}$ will provide the starting point for our discussion on string theory in chapter VIII.5. 


Exercises 

I.11.1 Integrate by parts to obtain for the scalar field action 

$$S = -\int d^4x\sqrt{-g}\,\frac{1}{2}\varphi\left(\frac{1}{\sqrt{-g}}\partial_\mu\sqrt{-g}\,g^{\mu\nu}\partial_\nu + m^2\right)\varphi$$

and write the equation of motion for $\varphi$ in curved spacetime. Discuss the propagator of the scalar field 
$D(x, y)$ (which is of course no longer translation invariant, i.e., it is no longer a function of $x - y$). 

I.11.2 Use (4) to find $E$ for a scalar field theory in flat spacetime. Show that the result agrees with what you 
would obtain using the canonical formalism of chapter I.8. 

I.11.3 Show that in flat spacetime $P^\mu$ as derived here from the stress energy tensor $T^{\mu\nu}$ when interpreted as 
an operator in the canonical formalism satisfies $[P^\mu, \varphi(x)] = -i\partial^\mu\varphi(x)$, and thus does exactly what you 
expect the energy and momentum operators to do, namely to be conjugate to time and space and hence 
represented by $-i\partial^\mu$. 

I.11.4 Show that for the Maxwell field $T_{ij} = -(E_iE_j + B_iB_j) + \frac{1}{2}\delta_{ij}(\vec E^2 + \vec B^2)$ and hence $T = 0$. 


8 We should not conclude from this that $g_{ij} = \delta_{ij}$. The point is that to leading order in $v/c$, our particle is 
sensitive only to $g_{00}$, as we have just shown. Indeed, restoring $c$ in the Schwarzschild metric we have 

$$ds^2 = \left(1 - \frac{2GM}{c^2r}\right)c^2dt^2 - \left(1 - \frac{2GM}{c^2r}\right)^{-1}dr^2 - r^2d\theta^2 - r^2\sin^2\theta\,d\phi^2 \to c^2dt^2 - d\vec x^2 - \frac{2GM}{r}dt^2 + O(1/c^2)$$

9 One technical problem, which we will address in chapter III.4, is that in the integral over $X(\zeta)$ apparently 
different functions $X(\zeta)$ may in fact be the same physically, related by a reparametrization. 




Field Theory Redux 


What have you learned so far? 

Now that we have reached the end of part I, let us take stock of what you have learned. 
Quantum field theory is not that difficult; it just consists of doing one great big integral 

$$Z(J) = \int D\varphi\, e^{i\int d^{D+1}x\left[\frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}m^2\varphi^2 - \frac{\lambda}{4!}\varphi^4 + J\varphi\right]} \tag{1}$$

By repeatedly functionally differentiating $Z(J)$ and then setting $J = 0$ we obtain 

$$\int D\varphi\,\varphi(x_1)\varphi(x_2)\cdots\varphi(x_n)\, e^{i\int d^{D+1}x\left[\frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}m^2\varphi^2 - \frac{\lambda}{4!}\varphi^4\right]} \tag{2}$$

which tells us about the amplitude for $n$ particles associated with the field $\varphi$ to come into 
and go out of existence at the spacetime points $x_1, x_2, \cdots, x_n$, interacting with each other 
in between. Birth and death, with some kind of life in between. 

Ah, if we could only do the integral in (1)! But we can’t. So one way of going about it is 
to evaluate the integral as a series in $\lambda$: 

$$\sum_{k=0}^{\infty}\frac{(-i\lambda)^k}{(4!)^k\,k!}\int D\varphi\,\varphi(x_1)\varphi(x_2)\cdots\varphi(x_n)\left[\int d^{D+1}y\,\varphi(y)^4\right]^k e^{i\int d^{D+1}x\left[\frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}m^2\varphi^2\right]} \tag{3}$$

To keep track of the terms in the series we draw little diagrams. 
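
The logic of expanding in $\lambda$ can be tried on the zero-dimensional "baby" analog of (1), where the functional integral collapses to one ordinary integral. The Python sketch below is only that analog, with assumptions stated plainly: a real (Euclidean) weight $e^{-\frac{1}{2}m^2q^2 - \frac{\lambda}{4!}q^4}$ instead of the oscillatory $e^{iS}$, no sources, $m = 1$, and a trapezoid rule for the "exact" answer. It compares the numerically evaluated ratio $Z(\lambda)/Z(0)$ with partial sums of the series in $\lambda$, whose terms follow from the Gaussian moments $\langle q^{4k}\rangle = (4k - 1)!!/m^{4k}$.

```python
import math

def double_factorial_odd(n):
    """n!! for odd n >= -1, with (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def series(lam, m=1.0, order=2):
    """Partial sum of Z(lam)/Z(0) = sum_k (-lam/4!)^k / k! * <q^(4k)>."""
    total = 0.0
    for k in range(order + 1):
        gaussian_moment = double_factorial_odd(4 * k - 1) / m ** (4 * k)
        total += (-lam / 24.0) ** k / math.factorial(k) * gaussian_moment
    return total

def numeric_ratio(lam, m=1.0, half_width=10.0, steps=20000):
    """Trapezoid estimate of Z(lam)/Z(0) for the zero-dimensional integral."""
    h = 2.0 * half_width / steps
    num = den = 0.0
    for i in range(steps + 1):
        q = -half_width + i * h
        edge = 0.5 if i in (0, steps) else 1.0
        gauss = math.exp(-0.5 * m * m * q * q)
        num += edge * gauss * math.exp(-lam / 24.0 * q ** 4)
        den += edge * gauss
    return num / den

lam = 0.1
print("numerical:", numeric_ratio(lam))
for order in (0, 1, 2, 3):
    print("order", order, series(lam, order=order))
```

For small $\lambda$ each additional order moves the partial sum closer to the numerical value. The coefficients grow factorially, so the series is asymptotic rather than convergent, and pushing to very high order eventually makes the agreement worse.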

Quantum field theorists try to dream up ways to evaluate (1), and failing that, they invent 
tricks and methods for extracting the physics they are interested in, by hook and by crook, 
without actually evaluating (1). 

To see that quantum field theory is a straightforward generalization of quantum mechanics, 
look at how (1) reduces appropriately. We have written the theory in $(D + 1)$-dimensional 
spacetime, that is, $D$ spatial dimensions and 1 temporal dimension. Consider 
(1) in $(0 + 1)$-dimensional spacetime, that is, no space; it becomes 

$$Z(J) = \int D\varphi \, e^{i\int dt \left[\frac{1}{2}\left(\frac{d\varphi}{dt}\right)^2 - \frac{1}{2}m^2\varphi^2 - \frac{\lambda}{4!}\varphi^4 + J\varphi\right]} \tag{4}$$
where we now denote the spacetime coordinate $x$ just by time $t$. We recognize this as the 
quantum mechanics of an anharmonic oscillator with the position of the mass point tied 
to the spring denoted by $\varphi$ and with an external force $J$ pushing on the oscillator. 
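
As a quick check of the oscillator reading of (4), here is a sketch of the classical equation of motion obtained by varying the action in (4) with respect to $\varphi$. It assumes that $J$ is a prescribed function of time and that the mass of the point particle has been set to 1, as the normalization of the kinetic term suggests.

```latex
\frac{d}{dt}\frac{\partial L}{\partial \dot\varphi} - \frac{\partial L}{\partial \varphi} = 0
\quad\Longrightarrow\quad
\frac{d^2\varphi}{dt^2} + m^2\varphi + \frac{\lambda}{3!}\varphi^3 = J(t)
```

For $\lambda = 0$ this is a driven harmonic oscillator of frequency $m$; the cubic term is the anharmonic restoring force.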

In the quantum field theory (1), each term in the action makes physical sense: The first 
two terms generalize the harmonic oscillator to include spatial variations, the third term 
the anharmonicity, and the last term an external probe. You can think of a quantum field 
theory as an infinite collection of anharmonic oscillators, one at each point in space. 

We have here a scalar field $\varphi$. In previous and future chapters, the notion of field was 
and will be generalized ever so slightly: The field can transform according to a nontrivial 
representation of the Lorentz group. We have already encountered fields transforming as 
a vector and a tensor and will presently encounter a field transforming as a spinor. Lorentz 
invariance and whatever other symmetries we have constrain the form of the action. The 
integral will look more complicated but the approach is exactly as outlined here. 

That’s just about all there is to quantum field theory. 
The Dirac Equation 


Staring into a fire 

According to a physics legend, apparently even true, Dirac was staring into a fire one 
evening in 1928 when he realized that he wanted, for reasons that are no longer relevant, 
a relativistic wave equation linear in spacetime derivatives $\partial_\mu \equiv \partial/\partial x^\mu$. At that time, the 
Klein-Gordon equation $(\partial^2 + m^2)\varphi = 0$, which describes a free particle of mass $m$ and is 
quadratic in spacetime derivatives, was already well known. This is in fact the equation of 
motion of the scalar field theory we studied earlier. 

At first sight, what Dirac wanted does not make sense. The equation is supposed to 
have the form "some linear combination of $\partial_\mu$ acting on some field $\psi$ is equal to some 
constant times the field." Denote the linear combination by $c^\mu\partial_\mu$. If the $c^\mu$'s are four 
ordinary numbers, then the four-vector $c^\mu$ defines some direction and the equation cannot 
be Lorentz invariant. 

Nevertheless, let us follow Dirac and write, using modern notation, 

$$(i\gamma^\mu\partial_\mu - m)\psi = 0 \tag{1}$$

At this point, the four quantities $i\gamma^\mu$ are just the coefficients of $\partial_\mu$ and $m$ is just a constant. 
We have already argued that $\gamma^\mu$ cannot simply be four numbers. Well, let us see what these 
objects have to be in order for this equation to contain the correct physics. 

Acting on (1) with $(i\gamma^\mu\partial_\mu + m)$, Dirac obtained $-(\gamma^\mu\gamma^\nu\partial_\mu\partial_\nu + m^2)\psi = 0$. It is traditional 
to define, in addition to the commutator $[A, B] = AB - BA$ familiar from quantum 
mechanics, the anticommutator $\{A, B\} = AB + BA$. Since derivatives commute, 
$\gamma^\mu\gamma^\nu\partial_\mu\partial_\nu = \frac{1}{2}\{\gamma^\mu, \gamma^\nu\}\partial_\mu\partial_\nu$, and we have $(\frac{1}{2}\{\gamma^\mu, \gamma^\nu\}\partial_\mu\partial_\nu + m^2)\psi = 0$. In a moment of 
inspiration Dirac realized that if 

$$\{\gamma^\mu, \gamma^\nu\} = 2\eta^{\mu\nu} \tag{2}$$

with $\eta^{\mu\nu}$ the Minkowski metric he would obtain $(\partial^2 + m^2)\psi = 0$, which describes a particle 
of mass $m$, and thus (1) would also describe a particle of mass $m$. 
Since $\eta^{\mu\nu}$ is a diagonal matrix with diagonal elements $\eta^{00} = 1$ and $\eta^{jj} = -1$, (2) says 
that $(\gamma^0)^2 = 1$, $(\gamma^j)^2 = -1$, and $\gamma^\mu\gamma^\nu = -\gamma^\nu\gamma^\mu$ for $\mu \neq \nu$. This last statement, that the 
coefficients $\gamma^\mu$ anticommute with each other, implies that they indeed cannot be ordinary 
numbers. Dirac's thought would make sense if we could find four such objects. 


Clifford algebra 


A set of objects $\gamma^\mu$ (clearly $d$ of them in $d$-dimensional spacetime) satisfying the relation 
(2) is said to form a Clifford algebra. I will develop the mathematics of Clifford algebra 
later. Suffice it for you to check here that the following 4 by 4 matrices satisfy (2): 

$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} = I \otimes \tau_3 \tag{3}$$

$$\gamma^i = \begin{pmatrix} 0 & \sigma^i \\ -\sigma^i & 0 \end{pmatrix} = \sigma^i \otimes i\tau_2 \tag{4}$$

Here $\sigma$ and $\tau$ denote the standard Pauli matrices. For historical reasons the four matrices 
$\gamma^\mu$ are known as gamma matrices, not a very imaginative name! (Our convention is such 
that whether an index on a Pauli matrix is upper or lower has no significance. On the other 
hand, we define $\gamma_\mu \equiv \eta_{\mu\nu}\gamma^\nu$ and it does matter whether the index on a gamma matrix is 
upper or lower; it is to be treated just like the index on any Lorentz vector. This convention 
is useful because then $\gamma^\mu\partial_\mu = \gamma_\mu\partial^\mu$.) 

The direct product notation is convenient for computation: For example, $\gamma^i\gamma^j = (\sigma^i \otimes i\tau_2)(\sigma^j \otimes i\tau_2) 
= (\sigma^i\sigma^j \otimes i^2\tau_2\tau_2) = -(\sigma^i\sigma^j \otimes I)$ and thus $\{\gamma^i, \gamma^j\} = -\{\sigma^i, \sigma^j\} \otimes I = 
-2\delta^{ij}$ as desired. 
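
The check suggested above can also be done by brute force. The following is a minimal sketch in plain Python, not part of the original text: it builds the matrices (3) and (4) from the Pauli matrices and tests the relation (2) entry by entry. The helper names (`direct`, `satisfies_clifford`, and so on) are illustrative assumptions, and `direct(s, t)` encodes the text's convention that in $\sigma \otimes \tau$ the $\tau$ factor labels the 2 by 2 blocks.

```python
# Pure-Python check that the Dirac-basis matrices (3) and (4) satisfy (2).
# Matrices are nested lists; all helper names here are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],       # sigma^1 (also used as tau_1)
    [[0, -1j], [1j, 0]],    # sigma^2 (also used as tau_2)
    [[1, 0], [0, -1]],      # sigma^3 (also used as tau_3)
]
ETA = [1, -1, -1, -1]       # diagonal of the Minkowski metric


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def dirac_gammas():
    """Return [gamma^0, gamma^1, gamma^2, gamma^3] as in (3) and (4)."""
    i_tau2 = [[1j * x for x in row] for row in PAULI[1]]
    return [direct(I2, PAULI[2])] + [direct(s, i_tau2) for s in PAULI]


def satisfies_clifford(gammas, tol=1e-12):
    """True if {gamma^mu, gamma^nu} = 2 eta^{mu nu} holds entry by entry."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(gammas[mu], gammas[nu])
            ba = matmul(gammas[nu], gammas[mu])
            for r in range(4):
                for c in range(4):
                    expected = 2 * ETA[mu] if (mu == nu and r == c) else 0
                    if abs(ab[r][c] + ba[r][c] - expected) > tol:
                        return False
    return True


gammas = dirac_gammas()
print(satisfies_clifford(gammas))
```

Replacing `dirac_gammas()` by any other candidate set of four matrices gives a quick way to test a proposed basis.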

You can convince yourself that the $\gamma^\mu$'s cannot be smaller than 4 by 4 matrices. The 
mathematics forces the Dirac spinor $\psi$ to have 4 components! The physical content of 
the Dirac equation (1) is most transparent if we transform to momentum space: we plug 
$\psi(x) = \int [d^4p/(2\pi)^4]\, e^{-ipx}\psi(p)$ into (1) and obtain 

$$(\gamma^\mu p_\mu - m)\psi(p) = 0 \tag{5}$$

Since (5) is Lorentz invariant, as we will show below, we can examine its physical content 
in any frame, in particular the rest frame $p^\mu = (m, \vec 0)$, in which it becomes 

$$(\gamma^0 - 1)\psi = 0 \tag{6}$$

As $(\gamma^0 - 1)^2 = -2(\gamma^0 - 1)$ we recognize $(\gamma^0 - 1)$ as a projection operator up to a trivial 
normalization. Indeed, using the explicit form in (3), we see that there is nothing mysterious 
to Dirac's equation: When written out, (6) reads 

$$\begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix}\psi = 0$$

thus telling us that 2 of the 4 components in $\psi$ are zero. 
This makes perfect sense since we know that the electron has 2 physical degrees of 
freedom, not 4. Viewed in this light, the mysterious Dirac equation is no more and no less 
than a projection that gets rid of the unwanted degrees of freedom. Compare our discussion 
of the equation of motion of a massive spin 1 particle (chapter I.5). There also, 1 of the 4 
components of $A_\mu$ is projected out. Indeed, the Klein-Gordon equation $(\partial^2 + m^2)\varphi(x) = 0$ 
just projects out those Fourier components $\varphi(k)$ not satisfying the mass shell condition 
$k^2 = m^2$. Our discussion provides a unified view of the equations of motion in relativistic 
physics: They just project out the unphysical components. 

A convenient notation introduced by Feynman, $\not{a} \equiv \gamma^\mu a_\mu$ for any 4-vector $a_\mu$, is now 
standard. The Dirac equation then reads $(i\not{\partial} - m)\psi = 0$. 


Cousins of the gamma matrices 


Under a Lorentz transformation $x'^\nu = \Lambda^\nu{}_\mu x^\mu$, the 4 components of the vector field $A^\mu$ 
transform like, well, a vector. How do the 4 components of $\psi$ transform? Surely not in the 
same way as $A^\mu$ since even under rotation $\psi$ and $A^\mu$ transform quite differently: one as 
spin $\frac{1}{2}$ and the other as spin 1. Let us write $\psi(x) \to \psi'(x') \equiv S(\Lambda)\psi(x)$ and try to determine 
the 4 by 4 matrix $S(\Lambda)$. 

It is a good idea to first sort out (and name) the 16 linearly independent 4 by 4 matrices. 
We already know five of them: the identity matrix and the $\gamma^\mu$'s. The strategy is simply 
to multiply the $\gamma^\mu$'s together, thus generating more 4 by 4 matrices until we get all 16. 
Since the square of a gamma matrix $\gamma^\mu$ is equal to $\pm 1$ and the $\gamma^\mu$'s anticommute with 
each other, we have to consider only $\gamma^\mu\gamma^\nu$, $\gamma^\mu\gamma^\nu\gamma^\lambda$, and $\gamma^\mu\gamma^\nu\gamma^\lambda\gamma^\rho$ with $\mu$, $\nu$, $\lambda$, and $\rho$ all 
different from one another. Thus, the only product of four gamma matrices that we have 
to consider is 

$$\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3 \tag{7}$$

This combination is so important that it has its own name! (The peculiar name comes 
about because in some old-fashioned notation the time coordinate was called $x^4$ with a 
corresponding $\gamma^4$.) We have 

$$\gamma^5 = i(I \otimes \tau_3)(\sigma^1 \otimes i\tau_2)(\sigma^2 \otimes i\tau_2)(\sigma^3 \otimes i\tau_2) = i^4 (I \otimes \tau_3)(\sigma^1\sigma^2\sigma^3 \otimes \tau_2)$$

and so 

$$\gamma^5 = I \otimes \tau_1 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \tag{8}$$

With the factor of $i$ included, $\gamma^5$ is manifestly hermitean. An important property is that $\gamma^5$ 
anticommutes with the $\gamma^\mu$'s: 

$$\{\gamma^5, \gamma^\mu\} = 0 \tag{9}$$
Continuing, we see that the products of three gamma matrices, all different, can be 
written as $\gamma^\mu\gamma^5$ (e.g., $\gamma^1\gamma^2\gamma^3 = -i\gamma^0\gamma^5$). Finally, using (2) we can write the product of 
two gamma matrices as $\gamma^\mu\gamma^\nu = \eta^{\mu\nu} - i\sigma^{\mu\nu}$, where 

$$\sigma^{\mu\nu} \equiv \frac{i}{2}[\gamma^\mu, \gamma^\nu] \tag{10}$$

There are $4 \cdot 3/2 = 6$ of these $\sigma^{\mu\nu}$ matrices. 

Count them, we got all 16. The set of 16 matrices $\{1, \gamma^\mu, \sigma^{\mu\nu}, \gamma^\mu\gamma^5, \gamma^5\}$ forms a 
complete basis of the space of all 4 by 4 matrices, that is, any 4 by 4 matrix can be written 
as a linear combination of these 16 matrices. 
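
The counting can be backed up numerically. The sketch below is an illustration and not the author's argument: it builds the 16 matrices in the basis (3) and (4) and checks that they are mutually orthogonal under the inner product $\mathrm{Tr}(A^\dagger B)$, each with norm 4. Orthogonality of 16 nonzero matrices implies linear independence, hence a complete basis for 4 by 4 matrices. All helper names are assumptions made for this example.

```python
# Build {1, gamma^mu, sigma^{mu nu}, gamma^mu gamma^5, gamma^5} in the Dirac basis
# and check orthogonality under Tr(A^dagger B). Helper names are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace_dagger(a, b):
    """Tr(a^dagger b) for square nested-list matrices."""
    n = len(a)
    return sum(complex(a[k][i]).conjugate() * b[k][i]
               for k in range(n) for i in range(n))


g = [direct(I2, PAULI[2])] + [direct(s, scale(1j, PAULI[1])) for s in PAULI]
g5 = scale(1j, matmul(matmul(g[0], g[1]), matmul(g[2], g[3])))   # definition (7)


def sigma(mu, nu):
    """sigma^{mu nu} = (i/2)[gamma^mu, gamma^nu], definition (10)."""
    return scale(0.5j, sub(matmul(g[mu], g[nu]), matmul(g[nu], g[mu])))


basis = ([direct(I2, I2)] + g
         + [sigma(mu, nu) for mu in range(4) for nu in range(mu + 1, 4)]
         + [matmul(g[mu], g5) for mu in range(4)]
         + [g5])

gram = [[trace_dagger(a, b) for b in basis] for a in basis]
orthogonal = all(abs(gram[r][c] - (4 if r == c else 0)) < 1e-12
                 for r in range(16) for c in range(16))
print(len(basis), orthogonal)
```

The same Gram matrix gives the expansion coefficients of an arbitrary 4 by 4 matrix $M$ in this basis as $\frac{1}{4}\mathrm{Tr}(\Gamma_a^\dagger M)$.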

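It is instructive to write out $\sigma^{\mu\nu}$ explicitly in the representation (3) and (4): 

$$\sigma^{0i} = i\begin{pmatrix} 0 & \sigma^i \\ \sigma^i & 0 \end{pmatrix} \tag{11}$$

$$\sigma^{ij} = \varepsilon^{ijk}\begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix} \tag{12}$$

We see that $\sigma^{ij}$ are just the Pauli matrices doubly stacked, for example, 

$$\sigma^{12} = \begin{pmatrix} \sigma^3 & 0 \\ 0 & \sigma^3 \end{pmatrix}$$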


Lorentz transformation 

Recall from a course on quantum mechanics that a general rotation can be written as 
$e^{i\vec\theta\cdot\vec J}$ with $\vec J$ the 3 generators of rotation and $\vec\theta$ 3 rotation parameters. Recall also that the 
Lorentz group contains boosts in addition to rotations, with $\vec K$ denoting the 3 generators of 
boosts. Recall from a course on electromagnetism that the 6 generators $\{\vec J, \vec K\}$ transform 
under the Lorentz group as the components of an antisymmetric tensor just like the 
electromagnetic field $F_{\mu\nu}$ and thus can be denoted by $J_{\mu\nu}$. I will discuss these matters 
in more detail in chapter II.3. For the moment, suffice it to note that with this notation 
we can write a Lorentz transformation as $\Lambda = e^{-\frac{i}{2}\omega_{\mu\nu}J^{\mu\nu}}$, with $J^{ij}$ generating rotations, 
$J^{0i}$ generating boosts, and the antisymmetric tensor $\omega_{\mu\nu} = -\omega_{\nu\mu}$ with its $6 = 4 \cdot 3/2$ 
components corresponding to the 3 rotation and 3 boost parameters. 

Given the preceding discussion and the fact that there are six matrices $\sigma^{\mu\nu}$, we suspect 
that up to an overall numerical factor the $\sigma^{\mu\nu}$'s must represent the 6 generators $J^{\mu\nu}$ of the 
Lorentz group acting on a spinor. In fact, our suspicion is confirmed by thinking about 
what a rotation $e^{-\frac{i}{2}\omega_{ij}J^{ij}}$ does. Referring to (12) we see that if $J^{ij}$ is represented by $\frac{1}{2}\sigma^{ij}$ 
this would correspond exactly to how a spin $\frac{1}{2}$ particle transforms in quantum mechanics. 
More precisely, separate the 4 components of the Dirac spinor into 2 sets of 2 components: 

$$\psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{13}$$
From (12) we see that under a rotation around the 3rd axis, $\phi \to e^{-i\omega_{12}\frac{1}{2}\sigma^3}\phi$ and $\chi \to 
e^{-i\omega_{12}\frac{1}{2}\sigma^3}\chi$. It is gratifying to see that $\phi$ and $\chi$ transform like 2-component Pauli spinors. 

We have thus figured out that a Lorentz transformation $\Lambda$ acting on $\psi$ is represented 
by $S(\Lambda) = e^{-(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}$, and so, acting on $\psi$, the generators $J^{\mu\nu}$ are indeed represented 
by $\frac{1}{2}\sigma^{\mu\nu}$. Therefore we would expect that if $\psi(x)$ satisfies the Dirac equation (1) then 
$\psi'(x') \equiv S(\Lambda)\psi(x)$ would satisfy the Dirac equation in the primed frame, 

$$(i\gamma^\mu\partial'_\mu - m)\psi'(x') = 0 \tag{14}$$

where $\partial'_\mu \equiv \partial/\partial x'^\mu$. To show this, calculate $[\sigma^{\mu\nu}, \gamma^\lambda] = 2i(\gamma^\mu\eta^{\nu\lambda} - \gamma^\nu\eta^{\mu\lambda})$ and hence 
for $\omega$ infinitesimal $S\gamma^\lambda S^{-1} = \gamma^\lambda - (i/4)\omega_{\mu\nu}[\sigma^{\mu\nu}, \gamma^\lambda] = \gamma^\lambda + \gamma^\mu\omega_\mu{}^\lambda$. Building up a finite 
Lorentz transformation by compounding infinitesimal transformations (just as in the 
standard discussion of the rotation group in quantum mechanics), we have $S\gamma^\lambda S^{-1} = 
\Lambda^\lambda{}_\mu\gamma^\mu$. 
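
The commutator quoted in this argument is easy to get wrong by a sign, so here is a hedged numerical check, a sketch rather than a derivation. It evaluates $[\sigma^{\mu\nu}, \gamma^\lambda]$ for all index values in the basis (3) and (4) and compares with $2i(\gamma^\mu\eta^{\nu\lambda} - \gamma^\nu\eta^{\mu\lambda})$. The function and variable names are assumptions of this example.

```python
# Check [sigma^{mu nu}, gamma^lambda] = 2i (gamma^mu eta^{nu lambda} - gamma^nu eta^{mu lambda})
# for the Dirac-basis matrices (3) and (4). Helper names are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def eta(mu, nu):
    """Minkowski metric with signature (+, -, -, -)."""
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(4)] for r in range(4)]


i_tau2 = [[1j * x for x in row] for row in PAULI[1]]
g = [direct(I2, PAULI[2])] + [direct(s, i_tau2) for s in PAULI]


def sigma(mu, nu):
    """sigma^{mu nu} = (i/2)[gamma^mu, gamma^nu], definition (10)."""
    return [[0.5j * x for x in row] for row in commutator(g[mu], g[nu])]


def identity_holds(tol=1e-12):
    """Compare both sides of the commutator identity for every (mu, nu, lambda)."""
    for mu in range(4):
        for nu in range(4):
            for lam in range(4):
                lhs = commutator(sigma(mu, nu), g[lam])
                for r in range(4):
                    for c in range(4):
                        rhs = 2j * (g[mu][r][c] * eta(nu, lam)
                                    - g[nu][r][c] * eta(mu, lam))
                        if abs(lhs[r][c] - rhs) > tol:
                            return False
    return True


print(identity_holds())
```

Contracting the identity with $-(i/4)\omega_{\mu\nu}$ then reproduces the infinitesimal form $\gamma^\lambda + \gamma^\mu\omega_\mu{}^\lambda$ quoted in the text.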


Dirac bilinears 

The Clifford algebra tells us that $(\gamma^0)^2 = +1$ and $(\gamma^i)^2 = -1$; hence the necessity for the $i$ 
in (4). One consequence of the $i$ is that $\gamma^0$ is hermitean while $\gamma^i$ is antihermitean, a fact 
conveniently expressed as 

$$(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0 \tag{15}$$

Thus, contrary to what you might think, the bilinear $\psi^\dagger\gamma^\mu\psi$ is not hermitean; rather, 
$\bar\psi\gamma^\mu\psi$ is hermitean with $\bar\psi \equiv \psi^\dagger\gamma^0$. The necessity for introducing $\bar\psi$ in addition to $\psi^\dagger$ in 
relativistic physics is traced back to the $(+, -, -, -)$ signature of the Minkowski metric. 
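
The hermiticity claim follows from (15) in one line. The following worked step is a sketch that uses only (15), $(\gamma^0)^2 = 1$, and the hermiticity of $\gamma^0$:

```latex
(\bar\psi\gamma^\mu\psi)^\dagger
  = \psi^\dagger(\gamma^\mu)^\dagger(\gamma^0)^\dagger\psi
  = \psi^\dagger\gamma^0\gamma^\mu\gamma^0\gamma^0\psi
  = \bar\psi\gamma^\mu\psi,
\qquad
(\psi^\dagger\gamma^i\psi)^\dagger
  = \psi^\dagger(\gamma^i)^\dagger\psi
  = -\psi^\dagger\gamma^i\psi .
```

The second relation shows explicitly that the spatial components of $\psi^\dagger\gamma^\mu\psi$ are antihermitean, which is why $\psi^\dagger$ alone will not do.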

It follows that $(\sigma^{\mu\nu})^\dagger = \gamma^0\sigma^{\mu\nu}\gamma^0$. Hence, $S(\Lambda)^\dagger = \gamma^0 e^{(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}\gamma^0$ (which incidentally, 
clearly shows that $S$ is not unitary, a fact we knew since $\sigma_{0i}$ is not hermitean), and so 

$$\bar\psi'(x') = \psi(x)^\dagger S(\Lambda)^\dagger\gamma^0 = \bar\psi(x)\, e^{+(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}} \tag{16}$$

We have 

$$\bar\psi'(x')\psi'(x') = \bar\psi(x)\, e^{+(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}} e^{-(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}\psi(x) = \bar\psi(x)\psi(x)$$

You are probably used to writing $\psi^\dagger\psi$ in nonrelativistic physics. In relativistic physics you 
have to get used to writing $\bar\psi\psi$. It is $\bar\psi\psi$, not $\psi^\dagger\psi$, that transforms as a Lorentz scalar. 

There are obviously 16 Dirac bilinears $\bar\psi\Gamma\psi$ that we can form, corresponding to the 16 
linearly independent $\Gamma$'s. You can now work out how various fermion bilinears transform 
(exercise II.1.1). The notation is rather nice: Various objects transform the way it looks like 
they should transform. We simply look at the Lorentz indices they carry. Thus, $\bar\psi(x)\gamma^\mu\psi(x)$ 
transforms as a Lorentz vector. 
Parity 

An important discrete symmetry in physics is that of parity or reflection in a mirror$^1$ 

$$x^\mu \to x'^\mu = (x^0, -\vec x) \tag{17}$$

Multiply the Dirac equation (1) by $\gamma^0$: $\gamma^0(i\gamma^\mu\partial_\mu - m)\psi(x) = 0 = (i\gamma^\mu\partial'_\mu - m)\gamma^0\psi(x)$, 
where $\partial'_\mu \equiv \partial/\partial x'^\mu$. Thus, 

$$\psi'(x') \equiv \eta\gamma^0\psi(x) \tag{18}$$

satisfies the Dirac equation in the space-reflected world (where $\eta$ is an arbitrary phase that 
we can set to 1). 

Note, for example, $\bar\psi'(x')\psi'(x') = \bar\psi(x)\psi(x)$ but $\bar\psi'(x')\gamma^5\psi'(x') = \bar\psi(x)\gamma^0\gamma^5\gamma^0\psi(x) = 
-\bar\psi(x)\gamma^5\psi(x)$. Under a Lorentz transformation $\bar\psi(x)\gamma^5\psi(x)$ and $\bar\psi(x)\psi(x)$ transform in 
the same way but under space reflection they transform in an opposite way; in other words, 
while $\bar\psi(x)\psi(x)$ transforms as a scalar, $\bar\psi(x)\gamma^5\psi(x)$ transforms as a pseudoscalar. 

You are now ready to do the all-important exercises in this chapter. 


The Dirac Lagrangian 

An interesting question: What Lagrangian would give Dirac's equation? The answer is 

$$\mathcal{L} = \bar\psi(i\not{\partial} - m)\psi \tag{19}$$

Since $\psi$ is complex we can vary $\psi$ and $\bar\psi$ independently to obtain the Euler-Lagrange 
equation of motion. Thus, $\partial_\mu(\delta\mathcal{L}/\delta\partial_\mu\psi) - \delta\mathcal{L}/\delta\psi = 0$ gives $\partial_\mu(i\bar\psi\gamma^\mu) + m\bar\psi = 0$, which 
upon hermitean conjugation and multiplication by $\gamma^0$ gives the Dirac equation (1). The 
other variational equation $\partial_\mu(\delta\mathcal{L}/\delta\partial_\mu\bar\psi) - \delta\mathcal{L}/\delta\bar\psi = 0$ gives the Dirac equation even more 
directly. (If you are disturbed by the asymmetric treatment of $\psi$ and $\bar\psi$, you can always 
integrate by parts in the action, have $\partial_\mu$ act on $\bar\psi$ in the Lagrangian, and then average the 
two forms of the Lagrangian. The action $S = \int d^4x\, \mathcal{L}$ treats $\psi$ and $\bar\psi$ symmetrically.) 


Slow and fast electrons 

Given a set of gamma matrices it is straightforward to solve the Dirac equation 

$$(\not{p} - m)\psi(p) = 0 \tag{20}$$

for $\psi(p)$: It is a simple matrix equation (see exercise II.1.3). 

1 Rotations consist of all linear transformations $x^i \to R^{ij}x^j$ such that $\det R = +1$. Those transformations with 
$\det R = -1$ are composed of parity followed by a rotation. In (3 + 1)-dimensional spacetime, parity can be defined 
as reversing one of the spatial coordinates or all three spatial coordinates. The two operations are related by a 
rotation. Note that in odd dimensional spacetime, parity is not the same as space inversion, in which all spatial 
coordinates are reversed (see exercise II.1.12). 
Note that if somebody uses the gamma matrices $\gamma^\mu$, you are free to use instead $\gamma'^\mu = 
W^{-1}\gamma^\mu W$ with $W$ any 4 by 4 matrix with an inverse. Obviously, $\gamma'^\mu$ also satisfy the Clifford 
algebra. This freedom of choice corresponds to a simple change of basis. Physics cannot 
depend on the choice of basis, but which basis is the most convenient depends on the 
physics. 

For example, suppose we want to study a slowly moving electron. Let us use the basis 
defined by (3) and (4), and the 2-component decomposition of $\psi$ (13). Since (6) tells us 
that $\chi(p) = 0$ for an electron at rest, we expect $\chi(p)$ to be much smaller than $\phi(p)$ for a 
slowly moving electron. 

In contrast, for momentum much larger than the mass, we can approximate (20) by 
$\not{p}\psi(p) = 0$. Multiplying on the left by $\gamma^5$, we see that if $\psi(p)$ is a solution then $\gamma^5\psi(p)$ 
is also a solution since $\gamma^5$ anticommutes with $\gamma^\mu$. Since $(\gamma^5)^2 = 1$, we can form two 
projection operators $P_L \equiv \frac{1}{2}(1 - \gamma^5)$ and $P_R \equiv \frac{1}{2}(1 + \gamma^5)$, satisfying $P_L^2 = P_L$, $P_R^2 = P_R$, 
and $P_L P_R = 0$. It is extremely useful to introduce the two combinations $\psi_L = \frac{1}{2}(1 - \gamma^5)\psi$ 
and $\psi_R = \frac{1}{2}(1 + \gamma^5)\psi$. Note that $\gamma^5\psi_L = -\psi_L$ and $\gamma^5\psi_R = +\psi_R$. Physically, a relativistic 
electron has two degrees of freedom known as helicities: it can spin either clockwise or 
anticlockwise around the direction of motion. I leave it to you as an exercise to show that 
$\psi_L$ and $\psi_R$ correspond to precisely these two possibilities. The subscripts $L$ and $R$ indicate 
left and right handed. Thus, for fast moving electrons, a basis known as the Weyl basis, 
designed so that $\gamma^5$, rather than $\gamma^0$, is diagonal, is more convenient. Instead of (3), we 
choose 

$$\gamma^0 = \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} = I \otimes \tau_1 \tag{21}$$

We keep $\gamma^i$ as in (4). This defines the Weyl basis. We now calculate 

$$\gamma^5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3 = i(I \otimes \tau_1)(\sigma^1\sigma^2\sigma^3 \otimes i^3\tau_2) = -(I \otimes \tau_3) = \begin{pmatrix} -I & 0 \\ 0 & I \end{pmatrix} \tag{22}$$


which is indeed diagonal as desired. The decomposition into left and right handed fields is 
of course defined regardless of what basis we feel like using, but in the Weyl basis we have 
the nice feature that $\psi_L$ has two upper components and $\psi_R$ has two lower components. 
The spinors $\psi_L$ and $\psi_R$ are known as Weyl spinors. 
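
To make the upper and lower component statement concrete, here is a small sketch in plain Python under the stated conventions: $\gamma^0 = I \otimes \tau_1$ as in (21) and $\gamma^i$ as in (4). It forms $P_L$ and $P_R$ from $\gamma^5$ and applies them to an arbitrary sample spinor. The helper names and the sample values are assumptions chosen only for illustration.

```python
# Chirality projectors in the Weyl basis. Helper names are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))


def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


i_tau2 = [[1j * x for x in row] for row in PAULI[1]]
g = [direct(I2, PAULI[0])] + [direct(s, i_tau2) for s in PAULI]   # Weyl basis
g5 = [[1j * x for x in row]
      for row in matmul(matmul(g[0], g[1]), matmul(g[2], g[3]))]

identity4 = direct(I2, I2)
zero4 = [[0] * 4 for _ in range(4)]
P_L = [[0.5 * (identity4[r][c] - g5[r][c]) for c in range(4)] for r in range(4)]
P_R = [[0.5 * (identity4[r][c] + g5[r][c]) for c in range(4)] for r in range(4)]

psi = [1, 2, 3, 4]                       # arbitrary sample spinor
psi_L, psi_R = apply(P_L, psi), apply(P_R, psi)
print(close(matmul(P_L, P_L), P_L), close(matmul(P_L, P_R), zero4))
print(psi_L, psi_R)
```

In this basis the projectors are diagonal, so $\psi_L$ simply keeps the first two entries of the sample spinor and $\psi_R$ the last two.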

Note that in going from the Dirac to the Weyl basis $\gamma^0$ and $\gamma^5$ trade places (up to a sign): 

$$\text{Dirac: } \gamma^0 \text{ diagonal}; \qquad \text{Weyl: } \gamma^5 \text{ diagonal}. \tag{23}$$

Physics dictates which basis to use: We prefer to have $\gamma^0$ diagonal when we deal with 
slowly moving spin $\frac{1}{2}$ particles, while we prefer to have $\gamma^5$ diagonal when we deal with fast 
moving spin $\frac{1}{2}$ particles. 

I note in passing that if we define $\sigma^\mu \equiv (I, \vec\sigma)$ and $\bar\sigma^\mu \equiv (I, -\vec\sigma)$ we can write 

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}$$

more compactly in the Weyl basis. (We develop this further in appendix E.) 
Chirality or handedness 

Regardless of whether a Dirac field $\psi(x)$ is massive or massless, it is enormously useful to 
decompose $\psi$ into left and right handed fields $\psi(x) = \psi_L(x) + \psi_R(x) \equiv \frac{1}{2}(1 - \gamma^5)\psi(x) + 
\frac{1}{2}(1 + \gamma^5)\psi(x)$. As an exercise, show that you can write the Dirac Lagrangian as 

$$\mathcal{L} = \bar\psi(i\not{\partial} - m)\psi = \bar\psi_L i\not{\partial}\psi_L + \bar\psi_R i\not{\partial}\psi_R - m(\bar\psi_L\psi_R + \bar\psi_R\psi_L) \tag{24}$$

The kinetic energy connects left to left and right to right, while the mass term connects 
left to right and right to left. 
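
A sketch of why the cross terms drop out, written with the projectors $P_{L,R} = \frac{1}{2}(1 \mp \gamma^5)$ and using only that $\gamma^5$ is hermitean and anticommutes with every $\gamma^\mu$ (this is one route through the exercise, not necessarily the intended one):

```latex
\begin{aligned}
\bar\psi_L &\equiv (P_L\psi)^\dagger\gamma^0 = \psi^\dagger P_L\gamma^0 = \bar\psi P_R ,
\qquad \bar\psi_R = \bar\psi P_L , \\
\bar\psi_L\gamma^\mu\psi_R &= \bar\psi P_R\gamma^\mu P_R\psi = \bar\psi\gamma^\mu P_L P_R\psi = 0 , \\
\bar\psi_L\psi_L &= \bar\psi P_R P_L\psi = 0 .
\end{aligned}
```

So a bilinear with one $\gamma^\mu$ links fields of the same handedness, while a bilinear with no gamma matrix links opposite handedness, which is the pattern seen in (24).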

The transformation $\psi \to e^{i\theta}\psi$ leaves the Lagrangian $\mathcal{L}$ invariant. Applying Noether's 
theorem, we obtain the conserved current associated with this symmetry $J^\mu = \bar\psi\gamma^\mu\psi$. 
Projecting into left and right handed fields we see that they transform the same way: 
$\psi_L \to e^{i\theta}\psi_L$ and $\psi_R \to e^{i\theta}\psi_R$. 

If $m = 0$, $\mathcal{L}$ enjoys an additional symmetry, known as a chiral symmetry, under which 
$\psi \to e^{i\phi\gamma^5}\psi$. Noether's theorem tells us that the axial current $J^{5\mu} \equiv \bar\psi\gamma^\mu\gamma^5\psi$ is conserved. 

The left and right handed fields transform in opposite ways: $\psi_L \to e^{-i\phi}\psi_L$ and $\psi_R \to 
e^{i\phi}\psi_R$. These points are particularly obvious when $\mathcal{L}$ is written in terms of $\psi_L$ and $\psi_R$, as 
in (24). 

In 1956 Lee and Yang proposed that the weak interaction does not preserve parity. It was 
eventually realized (with these four words I brush over a beautiful chapter in the history 
of particle physics; I urge you to read about it!) that the weak interaction Lagrangian has 
the generic form 

$$\mathcal{L} = G\,\bar\psi_{1L}\gamma^\mu\psi_{2L}\,\bar\psi_{3L}\gamma_\mu\psi_{4L} \tag{25}$$

where $\psi_{1,2,3,4}$ denotes four Dirac fields and $G$ the Fermi coupling constant. This 
Lagrangian clearly violates parity: Under a spatial reflection, left handed fields are 
transformed into right handed fields and vice versa. 

Incidentally, henceforth when I say a Lagrangian has a certain form, I will usually 
indicate only one or more of the relevant terms in the Lagrangian, as in (25). The other 
terms in the Lagrangian, such as $\bar\psi_1(i\not{\partial} - m_1)\psi_1$, are understood. If the term is not 
hermitean, then it is understood that we also add its hermitean conjugate. 


Interactions 

As we saw in (25), given the classification of bilinears in the spinor field you worked out 
in an exercise, it is easy to introduce interactions. As another example, we can couple 
a scalar field $\varphi$ to the Dirac field by adding the term $g\varphi\bar\psi\psi$ (with $g$ some coupling 
constant) to the Lagrangian $\mathcal{L} = \bar\psi(i\not{\partial} - m)\psi$ (and of course also adding the Lagrangian 
for $\varphi$). Similarly, we can couple a vector field $A_\mu$ by adding the term $eA_\mu\bar\psi\gamma^\mu\psi$. We 
note that in this case we can introduce the covariant derivative $D_\mu = \partial_\mu - ieA_\mu$ and write 
$\mathcal{L} = \bar\psi(i\not{\partial} - m)\psi + eA_\mu\bar\psi\gamma^\mu\psi = \bar\psi(i\gamma^\mu D_\mu - m)\psi$. Thus, the Lagrangian for a Dirac field 
interacting with a vector field of mass $\mu$ reads 

$$\mathcal{L} = \bar\psi(i\gamma^\mu D_\mu - m)\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\mu^2 A_\mu A^\mu \tag{26}$$

If the mass $\mu$ vanishes, this is the Lagrangian for quantum electrodynamics. Varying with 
respect to $\bar\psi$, we obtain the Dirac equation in the presence of an electromagnetic field: 

$$[i\gamma^\mu(\partial_\mu - ieA_\mu) - m]\psi = 0 \tag{27}$$


Charge conjugation and antimatter 

With coupling to the electromagnetic field, we have the concept of charge and hence of 
charge conjugation. Let us try to flip the charge $e$. Take the complex conjugate of (27): 
$[-i\gamma^{\mu *}(\partial_\mu + ieA_\mu) - m]\psi^* = 0$. Complex conjugating (2) we see that the $-\gamma^{\mu *}$ also satisfy 
the Clifford algebra and thus must be the $\gamma^\mu$ matrices expressed in a different basis, that 
is, there exists a matrix $C\gamma^0$ (the notation with an explicit factor of $\gamma^0$ is standard; see 
below) such that $-\gamma^{\mu *} = (C\gamma^0)^{-1}\gamma^\mu(C\gamma^0)$. Plugging in, we find that 

$$[i\gamma^\mu(\partial_\mu + ieA_\mu) - m]\psi_c = 0 \tag{28}$$

where we have defined $\psi_c \equiv C\gamma^0\psi^*$. Thus, if $\psi$ is the field of the electron, then $\psi_c$ is the 
field of a particle with a charge opposite to that of the electron but with the same mass, 
namely the positron. 

The discovery of antimatter was one of the most momentous in twentieth-century 
physics. We will discuss antimatter in more detail in the next chapter. 

It may be instructive to look at the specific form of the charge conjugation matrix $C$. We 
can write the defining equation for $C$ as $C\gamma^0\gamma^{\mu *}\gamma^0 C^{-1} = -\gamma^\mu$. Complex conjugating the 
equation $(\gamma^\mu)^\dagger = \gamma^0\gamma^\mu\gamma^0$, we obtain $(\gamma^\mu)^T = \gamma^0\gamma^{\mu *}\gamma^0$ if $\gamma^0$ is real. Thus, 

$$(\gamma^\mu)^T = -C^{-1}\gamma^\mu C \tag{29}$$

which explains why $C$ is defined with a $\gamma^0$ attached. 

In both the Dirac and the Weyl bases $\gamma^2$ is the only imaginary gamma matrix. Then the 
defining equation for $C$ just says that $C\gamma^0$ commutes with $\gamma^2$ but anticommutes with the 
other three $\gamma$ matrices. So evidently $C = \gamma^2\gamma^0$ [up to an arbitrary phase not fixed by (29)] 
and indeed $\gamma^2\gamma^{\mu *}\gamma^2 = \gamma^\mu$. Note that we have the simple (and satisfying) relation 

$$\psi_c = \gamma^2\psi^* \tag{30}$$

You can easily convince yourself (exercise II.1.9) that the charge conjugate of a left 
handed field is right handed and vice versa. As we will see later, this fact turns out to 
be crucial in the construction of grand unified theory. Experimentally, it is known that the 
neutrino is left handed. Thus, we can now predict that the antineutrino is right handed. 
Furthermore, $\psi_c$ transforms as a spinor. Let's check: Under a Lorentz transformation 
$\psi \to e^{-(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}\psi$; complex conjugating we have $\psi^* \to e^{+(i/4)\omega_{\mu\nu}(\sigma^{\mu\nu})^*}\psi^*$; hence $\psi_c \to 
\gamma^2 e^{+(i/4)\omega_{\mu\nu}(\sigma^{\mu\nu})^*}\psi^* = e^{-(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}\gamma^2\psi^* = e^{-(i/4)\omega_{\mu\nu}\sigma^{\mu\nu}}\psi_c$. [Recall from (10) that $\sigma^{\mu\nu}$ is 
defined with an explicit $i$.] 

Note that $C^T = \gamma^0\gamma^2 = -C$ in both the Dirac and the Weyl bases. 
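
The statements about $C = \gamma^2\gamma^0$ can be spot-checked numerically. The sketch below is an illustration with assumed helper names: it tests $\gamma^2\gamma^{\mu *}\gamma^2 = \gamma^\mu$, the relation (29), and $C^T = -C$ in both the Dirac and the Weyl bases. It uses the fact, also checked in the code, that $C^2 = 1$ so that $C^{-1} = C$.

```python
# Checks on the charge conjugation matrix C = gamma^2 gamma^0 in the Dirac and
# Weyl bases. Helper names are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def neg(a):
    return [[-x for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))


def charge_conjugation_checks(gamma0):
    """Return True if all relations hold for the basis with the given gamma^0."""
    i_tau2 = [[1j * x for x in row] for row in PAULI[1]]
    g = [gamma0] + [direct(s, i_tau2) for s in PAULI]
    C = matmul(g[2], g[0])
    ok = close(matmul(C, C), direct(I2, I2))          # C^2 = 1, so C^{-1} = C
    ok = ok and close(transpose(C), neg(C))           # C^T = -C
    for mu in range(4):
        # gamma^2 gamma^{mu *} gamma^2 = gamma^mu
        ok = ok and close(matmul(matmul(g[2], conj(g[mu])), g[2]), g[mu])
        # relation (29): (gamma^mu)^T = -C^{-1} gamma^mu C
        ok = ok and close(transpose(g[mu]), neg(matmul(matmul(C, g[mu]), C)))
    return ok


dirac_ok = charge_conjugation_checks(direct(I2, PAULI[2]))   # gamma^0 from (3)
weyl_ok = charge_conjugation_checks(direct(I2, PAULI[0]))    # gamma^0 from (21)
print(dirac_ok, weyl_ok)
```

Both bases pass because the checks depend only on the Clifford algebra and on which gamma matrices are real, and these properties are shared by (3) and (21).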

Majorana neutrino 

Since $\psi_c$ transforms as a spinor, Majorana$^2$ noted that Lorentz invariance allows not only 
the Dirac equation $i\not{\partial}\psi = m\psi$ but also the Majorana equation 

$$i\not{\partial}\psi = m\psi_c \tag{31}$$

Complex conjugating this equation and multiplying by $\gamma^2$, we have $-\gamma^2 i\gamma^{\mu *}\partial_\mu\psi^* = 
\gamma^2 m(-\gamma^2)\psi$, that is, $i\not{\partial}\psi_c = m\psi$. Thus, $-\partial^2\psi = i\not{\partial}(i\not{\partial}\psi) = i\not{\partial}m\psi_c = m^2\psi$. As we 
anticipated, $m$ is indeed the mass, known as a Majorana mass, of the particle associated with 
$\psi$. 

The Majorana equation (31) can be obtained from the Lagrangian$^3$ 

$$\mathcal{L} = \bar\psi i\not{\partial}\psi - \frac{1}{2}m(\psi^T C\psi + \bar\psi C\bar\psi^T) \tag{32}$$

upon varying $\bar\psi$. 

Since $\psi$ and $\psi_c$ carry opposite charge, the Majorana equation, unlike the Dirac equation, 
can only be applied to electrically neutral fields. However, as $\psi_c$ is right handed if $\psi$ is left 
handed, the Majorana equation, again unlike the Dirac equation, preserves handedness. 
Thus, the Majorana equation is almost tailor made for the neutrino. 

From its conception the neutrino was assumed to be massless, but a couple of years ago 
experimentalists established that it has a small but nonvanishing mass. As of this writing, it 
is not known whether the neutrino mass is Dirac or Majorana. We will see in chapter VII.7 
that a Majorana mass for the neutrino arises naturally in the $SO(10)$ grand unified theory. 

Finally, there is the possibility that $\psi = \psi_c$, in which case $\psi$ is known as a Majorana 
spinor. 


Time reversal 

Finally, we come to time reversal,$^4$ which as you probably know, is much more confusing 
to discuss than parity and charge conjugation. In a famous 1932 paper Wigner showed that 

2 Ettore Majorana had a brilliant but tragically short career. In his early thirties, he disappeared off the coast 
of Sicily during a boat trip. The precise cause of his death remains a mystery. See F. Guerra and N. Robotti, Ettore 
Majorana: Aspects of His Scientific and Academic Activity. 

3 Upon recalling that $C$ is antisymmetric, you may have worried that $\psi^T C\psi = C_{\alpha\beta}\psi_\alpha\psi_\beta$ vanishes. In future 
chapters we will learn that $\psi$ has to be treated as anticommuting "Grassmannian numbers." 

4 Incidentally, I do not feel that we completely understand the implications of time-reversal invariance. See 
A. Zee, "Night thoughts on consciousness and time reversal," in: Art and Symmetry in Experimental Physics, pp. 
246-249. 
time reversal is represented by an antiunitary operator. Since this peculiar feature already 
appears in nonrelativistic quantum physics, it is in some sense not the responsibility of a 
book on relativistic quantum field theory to explain time reversal as an antiunitary operator. 
Nevertheless, let me try to be as clear as possible. I adopt the approach of “letting the 
physics, namely the equations, lead us.” 

Take the Schrödinger equation $i(\partial/\partial t)\Psi(t) = H\Psi(t)$ (and for definiteness, think of 
$H = -(1/2m)\nabla^2 + V(\vec x)$, just simple one particle nonrelativistic quantum mechanics.) 
We suppress the dependence of $\Psi$ on $\vec x$. Consider the transformation $t \to t' = -t$. We 
want to find a $\Psi'(t')$ such that $i(\partial/\partial t')\Psi'(t') = H\Psi'(t')$. Write $\Psi'(t') = T\Psi(t)$, where $T$ 
is some operator to be determined (up to some arbitrary phase factor $\eta$). Plugging in, we 
have $i[\partial/\partial(-t)]T\Psi(t) = HT\Psi(t)$. Multiply by $T^{-1}$, and we obtain $T^{-1}(-i)T(\partial/\partial t)\Psi(t) = 
T^{-1}HT\Psi(t)$. Since $H$ does not involve time in any way, we want $T^{-1}H = HT^{-1}$. Then 
$T^{-1}(-i)T(\partial/\partial t)\Psi(t) = H\Psi(t)$. We are forced to conclude, as Wigner was, that 

$$T^{-1}(-i)T = i \tag{33}$$

Speaking colloquially, we can say that in quantum physics time goes with an i and so 
flipping time means flipping i as well. 

Let $T = UK$, where $K$ complex conjugates everything to its right. Then $T^{-1} = KU^{-1}$ 
and (33) holds if $U^{-1}iU = i$, that is, if $U^{-1}$ is just an ordinary (unitary) operator that does 
nothing to $i$. We will determine $U$ as we go along. The presence of $K$ makes $T$ "antiunitary." 

We check that this works for a spinless particle in a plane wave state $\Psi(t) = e^{i(\vec k\cdot\vec x - Et)}$. 
Plugging in, we have $\Psi'(t') = T\Psi(t) = UK\Psi(t) = U\Psi^*(t) = Ue^{-i(\vec k\cdot\vec x - Et)}$; since $\Psi$ has 
only one component, $U$ is just a phase factor$^5$ $\eta$ that we can choose to be 1. Rewriting, we 
have $\Psi'(t) = e^{-i(\vec k\cdot\vec x + Et)} = e^{i(-\vec k\cdot\vec x - Et)}$. Indeed, $\Psi'$ describes a plane wave moving in the 
opposite direction. Crucially, $\Psi'(t) \propto e^{-iEt}$ and thus has positive energy as it should. Note 
that acting on a spinless particle $T^2 = UKUK = UU^*K^2 = +1$. 

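Next consider a spin $\frac{1}{2}$ nonrelativistic electron. Acting with $T$ on the spin-up state $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ 
we want to obtain the spin-down state $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Thus, we need a nontrivial matrix $U = \eta\sigma_2$ 
to flip the spin: 

$$T\begin{pmatrix} 1 \\ 0 \end{pmatrix} = U\begin{pmatrix} 1 \\ 0 \end{pmatrix} = i\eta\begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

Similarly, $T$ acting on the spin-down state produces the spin-up state. Note that acting on 
a spin $\frac{1}{2}$ particle 

$$T^2 = \eta\sigma_2 K\eta\sigma_2 K = \eta\sigma_2\eta^*\sigma_2^* KK = -1$$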
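
The antiunitary action on a 2-component spinor is simple enough to code directly. The following sketch (function name and sample values are assumptions for illustration) implements $T = UK$ with $U = \eta\sigma_2$: it first complex conjugates the components and then multiplies by $\eta\sigma_2$.

```python
# T = U K acting on a 2-component spinor, with U = eta * sigma_2.
# The function name and the sample spinors are illustrative.

def time_reverse(spinor, eta=1):
    """Complex conjugate the components (K), then multiply by eta * sigma_2 (U)."""
    a, b = (complex(z).conjugate() for z in spinor)
    # sigma_2 = [[0, -i], [i, 0]] maps (a, b) to (-i b, i a)
    return [eta * (-1j) * b, eta * 1j * a]


up = [1, 0]
down = [0, 1]
print(time_reverse(up))      # proportional to the spin-down state
print(time_reverse(down))    # proportional to the spin-up state

psi = [0.6, 0.8j]            # arbitrary normalized sample spinor
twice = time_reverse(time_reverse(psi))
print(twice)                 # equals -psi, illustrating T^2 = -1
```

Passing a different unit-modulus `eta` changes the result of a single application by a phase but leaves the double application equal to minus the input, since $\eta\eta^* = 1$.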

This is the origin of Kramers degeneracy: In a system with an odd number of electrons 
in an electric field, no matter how complicated, each energy level is twofold degenerate. 
The proof is very simple: Since the system is time reversal invariant, $\Psi$ and $T\Psi$ have the 
same energy. Suppose they actually represent the same state. Then $T\Psi = e^{i\alpha}\Psi$, but then 

5 It is a phase factor rather than an arbitrary complex number because we require that $|\Psi'|^2 = |\Psi|^2$. 
$T^2\Psi = T(T\Psi) = Te^{i\alpha}\Psi = e^{-i\alpha}T\Psi = \Psi \neq -\Psi$. So $\Psi$ and $T\Psi$ must represent two distinct 
states. 

All of this is beautiful stuff, which as I noted earlier you could and should have learned 
in a decent course on quantum mechanics. My responsibility here is to show you how it 
works for the Dirac equation. Multiplying (1) by $\gamma^0$ from the left, we have $i(\partial/\partial t)\psi(t) = 
H\psi(t)$ with $H = -i\gamma^0\gamma^i\partial_i + \gamma^0 m$. Once again, we want $i(\partial/\partial t')\psi'(t') = H\psi'(t')$ with 
$\psi'(t') = T\psi(t)$ and $T$ some operator to be determined. The discussion above carries 
over if $T^{-1}HT = H$, that is, $KU^{-1}HUK = H$. Thus, we require $KU^{-1}\gamma^0 UK = \gamma^0$ and 
$KU^{-1}(i\gamma^0\gamma^i)UK = i\gamma^0\gamma^i$. Multiplying by $K$ on the left and on the right, we see that 
we have to solve for a $U$ such that $U^{-1}\gamma^0 U = \gamma^{0*}$ and $U^{-1}\gamma^i U = -\gamma^{i*}$. We now restrict 
ourselves to the Dirac and Weyl bases, in both of which $\gamma^2$ is the only imaginary guy. Okay, 
what flips $\gamma^1$ and $\gamma^3$ but not $\gamma^0$ and $\gamma^2$? Well, $U = \eta\gamma^1\gamma^3$ (with $\eta$ an arbitrary phase factor) 
works: 

$$\psi'(t') = \eta\gamma^1\gamma^3 K\psi(t) \tag{34}$$

Since the $\gamma^i$'s are the same in both the Dirac and the Weyl bases, in either we have from 
(4) 

$$U = \eta(\sigma^1 \otimes i\tau_2)(\sigma^3 \otimes i\tau_2) = \eta\, i\sigma^2 \otimes 1$$

As we expect, acting on the 2-component spinors contained in $\psi$, the time reversal operator 
$T$ involves multiplying by $i\sigma^2$. Note also that as in the nonrelativistic case $T^2\psi = -\psi$. 
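
As a cross-check of the choice $U = \gamma^1\gamma^3$ (taking $\eta = 1$), the sketch below verifies the two conditions $U^{-1}\gamma^0 U = \gamma^{0*}$ and $U^{-1}\gamma^i U = -\gamma^{i*}$ in the equivalent form $\gamma^0 U = U\gamma^{0*}$ and $\gamma^i U = -U\gamma^{i*}$, which avoids computing an inverse. It also confirms $U = i\sigma^2 \otimes 1$ and $T^2 = UU^* = -1$. The Dirac basis is used and the helper names are assumptions of this example.

```python
# Check that U = gamma^1 gamma^3 implements time reversal for the Dirac equation.
# Dirac basis, eta = 1. Helper names are illustrative.

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def direct(s, t):
    """s (x) t in the text's convention: t labels the blocks, s sits inside each block."""
    return [[t[bi][bj] * s[i][j] for bj in range(2) for j in range(2)]
            for bi in range(2) for i in range(2)]


def conj(a):
    return [[complex(x).conjugate() for x in row] for row in a]


def neg(a):
    return [[-x for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))


i_tau2 = [[1j * x for x in row] for row in PAULI[1]]
g = [direct(I2, PAULI[2])] + [direct(s, i_tau2) for s in PAULI]
U = matmul(g[1], g[3])

# gamma^0 U = U gamma^{0*}  and  gamma^i U = -U gamma^{i*}
conditions = [close(matmul(g[0], U), matmul(U, conj(g[0])))]
conditions += [close(matmul(g[i], U), neg(matmul(U, conj(g[i])))) for i in (1, 2, 3)]

i_sigma2 = [[0, 1], [-1, 0]]
u_is_i_sigma2 = close(U, direct(i_sigma2, I2))

# T^2 = U K U K = U U^*
t_squared_is_minus_one = close(matmul(U, conj(U)), neg(direct(I2, I2)))
print(all(conditions), u_is_i_sigma2, t_squared_is_minus_one)
```

Swapping in another product such as $\gamma^1\gamma^2$ for `U` makes the conditions fail, which is a useful way to see why the pair $\gamma^1\gamma^3$ is singled out.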

It may not have escaped your notice that $\gamma^0$ appears in the parity operator (18), $\gamma^2$ in 
charge conjugation (30), and $\gamma^1\gamma^3$ in time reversal (34). If we change a Dirac particle to its 
antiparticle and flip spacetime, $\gamma^5$ appears. 


CPT theorem 

There exists a profound theorem stating that any local Lorentz invariant field theory must 
be invariant under$^6$ $\mathcal{CPT}$, the combined action of charge conjugation, parity, and time 
reversal. The pedestrian proof consists simply of checking that any Lorentz invariant local 
interaction you can write down [such as (25)], while it may break charge conjugation, 
parity, or time reversal separately, respects $\mathcal{CPT}$. The more fundamental proof involves 
considerable formal machinery that I will not develop here. You are urged to read about 
the phenomenological study of charge conjugation, parity, time reversal, and $\mathcal{CPT}$, surely 
one of the most fascinating chapters in the history of physics.$^7$ 


6 A rather pedantic point, but potentially confusing to some students, is that I distinguish carefully between the 
action of charge conjugation $\mathcal{C}$ and the matrix $C$: Charge conjugation $\mathcal{C}$ involves taking the complex conjugate of 
$\psi$ and then scrambling the components with $C\gamma^0$. Similarly, I distinguish between the operation of time reversal 
$\mathcal{T}$ and the matrix $T$. 

7 See, e.g., J. J. Sakurai, Invariance Principles and Elementary Particles and E. D. Commins, Weak Interactions. 





Two stories 

I end this chapter with two of my favorite physics stories—one short and one long. 

Paul Dirac was notoriously a man of few words. Dick Feynman told the story that when 
he first met Dirac at a conference, Dirac said after a long silence, “I have an equation; do 
you have one too?” 

Enrico Fermi did not usually take notes, but during the 1948 Pocono conference (see
chapter I.7) he took voluminous notes during Julian Schwinger’s lecture. When he got
back to Chicago, he assembled a group consisting of two professors, Edward Teller and
Gregor Wentzel, and four graduate students, Geoff Chew, Murph Goldberger, Marshall
Rosenbluth, and Chen-Ning Yang (all to become major figures later). The group met in
Fermi’s office several times a week, a couple of hours each time, to try to figure out what
Schwinger had done. After 6 weeks, everyone was exhausted. Then someone asked, “Didn’t
Feynman also speak?” The three professors, who had attended the conference, said yes.
But when pressed, not Fermi, nor Teller, nor Wentzel could recall what Feynman had said.
All they remembered was his strange notation: p with a funny slash through it.⁸


Exercises 

II.1.1 Show that the following bilinears in the spinor field $\bar\psi\psi$, $\bar\psi\gamma^\mu\psi$, $\bar\psi\sigma^{\mu\nu}\psi$, $\bar\psi\gamma^\mu\gamma^5\psi$, and $\bar\psi\gamma^5\psi$
transform under the Lorentz group and parity as a scalar, a vector, a tensor, a pseudovector or axial vector, and
a pseudoscalar, respectively. [Hint: For example, $\bar\psi\gamma^\mu\gamma^5\psi \to \bar\psi[1 + (i/4)\omega\sigma]\gamma^\mu\gamma^5[1 - (i/4)\omega\sigma]\psi$ under
an infinitesimal Lorentz transformation and $\to \bar\psi\gamma^0\gamma^\mu\gamma^5\gamma^0\psi$ under parity. Work out these transformation
laws and show that they define an axial vector.]

II.1.2 Write all the bilinears in the preceding exercise in terms of $\psi_L$ and $\psi_R$.

II.1.3 Solve $(\slashed{p} - m)\psi(p) = 0$ explicitly (by rotational invariance it suffices to solve it for $\vec p$ along the 3rd
direction, say). Verify that indeed $\chi$ is much smaller than $\phi$ for a slowly moving electron. What happens
for a fast moving electron?

II.1.4 Exploiting the fact that $\chi$ is much smaller than $\phi$ for a slowly moving electron, find the approximate
equation satisfied by $\phi$.

II.1.5 For a relativistic electron moving along the z-axis, perform a rotation around the z-axis. In other words,
study the effect of $e^{-(i/4)\omega\sigma^{12}}$ on $\psi(p)$ and verify the assertion in the text regarding $\psi_L$ and $\psi_R$.

II.1.6 Solve the massless Dirac equation.

II.1.7 Show explicitly that (25) violates parity.

II.1.8 The defining equation for $C$ evidently fixes $C$ only up to an overall constant. Show that this constant is
fixed by requiring $(\psi_c)_c = \psi$.


8 C. N. Yang, Lecture at the Schwinger Memorial Session of the American Physical Society meeting in
Washington, D.C., 1995.

II.1.9 Show that the charge conjugate of a left handed field is right handed and vice versa.

II.1.10 Show that $\psi C\psi$ is a Lorentz scalar.

II.1.11 Work out the Dirac equation in (1 + 1)-dimensional spacetime.

II.1.12 Work out the Dirac equation in (2 + 1)-dimensional spacetime. Show that the apparently innocuous
mass term violates parity and time reversal. [Hint: The three $\gamma^\mu$’s are just the three Pauli matrices with
appropriate factors of $i$.]




Quantizing the Dirac Field 


Anticommutation 


We will use the canonical formalism of chapter I.8 to quantize the Dirac field.

Long and careful study of atomic spectroscopy revealed that the wave function of two 
electrons had to be antisymmetric upon exchange of their quantum numbers. It follows 
that we cannot put two electrons into the same energy level so that they will have the 
same quantum numbers. In 1928 Jordan and Wigner showed how this requirement of an 
antisymmetric wave function can be formalized by having the creation and annihilation 
operators for electrons satisfy anticommutation rather than commutation relations as in
(I.8.12).

Let us start out with a state with no electron $|0\rangle$ and denote by $b^\dagger_\alpha$ the operator creating
an electron with the quantum numbers $\alpha$. In other words, the state $b^\dagger_\alpha|0\rangle$ is the state
with an electron having the quantum numbers $\alpha$. Now suppose we want to have another
electron with the quantum numbers $\beta$, so we construct the state $b^\dagger_\beta b^\dagger_\alpha|0\rangle$. For this to be
antisymmetric upon interchanging $\alpha$ and $\beta$ we must have

$$\{b^\dagger_\alpha, b^\dagger_\beta\} \equiv b^\dagger_\alpha b^\dagger_\beta + b^\dagger_\beta b^\dagger_\alpha = 0 \tag{1}$$

Upon hermitean conjugation, we have $\{b_\alpha, b_\beta\} = 0$. In particular, $b^\dagger_\alpha b^\dagger_\alpha = 0$, so that we
cannot create two electrons with the same quantum numbers.

To this anticommutation relation we add

$$\{b_\alpha, b^\dagger_\beta\} = \delta_{\alpha\beta} \tag{2}$$

One way of arguing for this is to say that we would like the number operator to be
$N = \sum_\alpha b^\dagger_\alpha b_\alpha$, just as in the bosonic case. Show with one line of algebra that $[AB, C] =
A[B, C] + [A, C]B$ or $[AB, C] = A\{B, C\} - \{A, C\}B$. (A heuristic way of remembering
the minus sign in the anticommuting case is that we have to move $C$ past $B$ in order for
$C$ to do its anticommuting with $A$.) For the desired number operator to work we need
$[\sum_\alpha b^\dagger_\alpha b_\alpha, b^\dagger_\beta] = +b^\dagger_\beta$ (so that as usual $N|0\rangle = 0$, and $Nb^\dagger_\beta|0\rangle = b^\dagger_\beta|0\rangle$) and so we have (2).
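
To make relations (1) and (2) concrete, here is a small numerical sketch, not taken from the text. It represents two fermionic modes $\alpha$ and $\beta$ by 4 by 4 matrices using the Jordan-Wigner construction (an assumption of this sketch; any faithful representation would do) and checks the anticommutators and the number operator. The names `a`, `Z`, `b`, `dag`, and so on are illustrative.

```python
import numpy as np

# One fermionic mode on the basis (|0>, |1>): a lowers |1> to |0>.
a = np.array([[0, 1], [0, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)   # sign string that makes different modes anticommute
I2 = np.eye(2, dtype=complex)

# Two modes alpha and beta (Jordan-Wigner construction, an assumption of this sketch).
b = {"alpha": np.kron(a, I2), "beta": np.kron(Z, a)}

def dag(M):
    return M.conj().T

def anticomm(A, B):
    return A @ B + B @ A

def comm(A, B):
    return A @ B - B @ A

vac = np.zeros(4, dtype=complex)
vac[0] = 1.0                                   # |0>: both modes empty
N = sum(dag(op) @ op for op in b.values())     # number operator

one_particle = dag(b["beta"]) @ vac
two_particle = dag(b["beta"]) @ dag(b["alpha"]) @ vac
swapped = dag(b["alpha"]) @ dag(b["beta"]) @ vac

print(np.allclose(anticomm(dag(b["alpha"]), dag(b["beta"])), 0))   # relation (1)
print(np.allclose(anticomm(b["alpha"], dag(b["alpha"])), np.eye(4)))  # relation (2), alpha = beta
print(np.allclose(comm(N, dag(b["beta"])), dag(b["beta"])))        # [N, b_beta^dagger] = b_beta^dagger
print(np.allclose(two_particle, -swapped))                         # antisymmetry under exchange
```

The two-electron state changes sign when the order of the creation operators is exchanged, and the square of any single creation operator vanishes, which is the exclusion principle in matrix form.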

The Dirac field

Let us now turn to the free Dirac Lagrangian

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - m)\psi \tag{3}$$

The momentum conjugate to $\psi$ is $\pi_\alpha = \delta\mathcal{L}/\delta\partial_t\psi_\alpha = i\psi^\dagger_\alpha$. We anticipate that the correct
canonical procedure requires imposing the anticommutation relation:

$$\{\psi_\alpha(\vec x, t), \psi^\dagger_\beta(\vec 0, t)\} = \delta^{(3)}(\vec x)\,\delta_{\alpha\beta} \tag{4}$$

We will derive this below.

The Dirac field satisfies

$$(i\slashed{\partial} - m)\psi = 0 \tag{5}$$

Plugging in plane waves $u(p, s)e^{-ipx}$ and $v(p, s)e^{ipx}$ for $\psi$, we have

$$(\slashed{p} - m)u(p, s) = 0 \tag{6}$$

and

$$(\slashed{p} + m)v(p, s) = 0 \tag{7}$$

The index $s = \pm 1$ reminds us that each of these two equations has two solutions, spin
up and spin down. Evidently, under a Lorentz transformation the two spinors $u$ and $v$
transform in the same way as $\psi$. Thus, if we define $\bar u \equiv u^\dagger\gamma^0$ and $\bar v \equiv v^\dagger\gamma^0$, then $\bar u u$ and
$\bar v v$ are Lorentz scalars.

This subject is full of “peculiar” signs and so I will proceed very carefully and show you 
how every sign makes sense. 

First, since (6) and (7) are linear we have to fix the normalization of $u$ and $v$. Since
$\bar u(p, s)u(p, s)$ and $\bar v(p, s)v(p, s)$ are Lorentz scalars, the normalization condition we
impose on them in the rest frame will hold in any frame.

Our strategy is to do things in the rest frame using a particular basis and then invoke
Lorentz invariance and basis independence. In the rest frame, (6) and (7) reduce to
$(\gamma^0 - 1)u = 0$ and $(\gamma^0 + 1)v = 0$. In particular, in the Dirac basis $\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}$, so the
two independent spinors $u$ (labeled by spin $s = \pm 1$) have the form

$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$$

while the two independent spinors $v$ have the form

$$\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$$

The normalization conditions we have implicitly chosen are then $\bar u(p, s)u(p, s) = 1$ and
$\bar v(p, s)v(p, s) = -1$. Note the minus sign thrust upon us. Clearly, we also have the
orthogonality condition $\bar u v = 0$ and $\bar v u = 0$. Lorentz invariance and basis independence then tell
us that these four relations hold in general.

Furthermore, in the rest frame

$$\sum_s u_\alpha(p, s)\bar u_\beta(p, s) = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}_{\alpha\beta} = \tfrac{1}{2}(\gamma^0 + 1)_{\alpha\beta}$$

and

$$\sum_s v_\alpha(p, s)\bar v_\beta(p, s) = \begin{pmatrix} 0 & 0 \\ 0 & -I \end{pmatrix}_{\alpha\beta} = \tfrac{1}{2}(\gamma^0 - 1)_{\alpha\beta}$$

Thus, in general

$$\sum_s u_\alpha(p, s)\bar u_\beta(p, s) = \left(\frac{\slashed{p} + m}{2m}\right)_{\alpha\beta} \tag{8}$$

and

$$\sum_s v_\alpha(p, s)\bar v_\beta(p, s) = \left(\frac{\slashed{p} - m}{2m}\right)_{\alpha\beta} \tag{9}$$

Another way of deriving (8) is to note that the left hand side is a 4 by 4 matrix (it is like a
column vector multiplied by a row vector on the right) and so must be a linear combination
of the sixteen 4 by 4 matrices we listed in chapter II.1. Argue that $\gamma^5$ and $\gamma^\mu\gamma^5$ are ruled out
by parity and that $\sigma^{\mu\nu}$ is ruled out by Lorentz invariance and the fact that only one Lorentz
vector, namely $p^\mu$, is available. Hence the right hand side must be a linear combination of
$\slashed{p}$ and $m$. Fix the relative coefficient by acting with $\slashed{p} - m$ from the left. The normalization
is fixed by setting $\alpha = \beta$ and summing over $\alpha$. Similarly for (9). In particular, setting $\alpha = \beta$
and summing over $\alpha$, we recover $\bar v(p, s)v(p, s) = -1$.
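
The spin sums (8) and (9) can be checked numerically for an arbitrary momentum. The sketch below is not from the text. It assumes the Dirac basis, the metric signature $(+,-,-,-)$, and the standard closed-form boosted spinors $u(p,s) = (\slashed{p} + m)u(0,s)/\sqrt{2m(E+m)}$ and $v(p,s) = (-\slashed{p} + m)v(0,s)/\sqrt{2m(E+m)}$, which are one convenient way (my choice, not the book's construction) to obtain solutions of (6) and (7) with the rest normalization. The momentum values and names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac basis: gamma^0 = diag(I, -I), gamma^i = [[0, sigma^i], [-sigma^i, 0]].
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]

def slash(E, p):
    """gamma^mu p_mu = E gamma^0 - p . gamma for signature (+, -, -, -)."""
    return E * gamma0 - sum(pi * gi for pi, gi in zip(p, gammas))

def bar(w):
    """Dirac adjoint w^dagger gamma^0, returned as a 1-D array."""
    return w.conj() @ gamma0

m = 1.0
p = np.array([0.3, -0.4, 1.2])           # arbitrary illustrative 3-momentum
E = np.sqrt(p @ p + m**2)
pslash = slash(E, p)
norm = np.sqrt(2 * m * (E + m))
identity = np.eye(4, dtype=complex)

# Rest-frame spinors are the columns of the identity: u from columns 0, 1 and v from 2, 3.
u = [(pslash + m * identity) @ identity[:, s] / norm for s in (0, 1)]
v = [(-pslash + m * identity) @ identity[:, s] / norm for s in (2, 3)]

sum_u = sum(np.outer(w, bar(w)) for w in u)   # sum_s u_alpha ubar_beta
sum_v = sum(np.outer(w, bar(w)) for w in v)   # sum_s v_alpha vbar_beta

print(np.allclose(sum_u, (pslash + m * identity) / (2 * m)))   # equation (8)
print(np.allclose(sum_v, (pslash - m * identity) / (2 * m)))   # equation (9)
print([complex(bar(w) @ w).real for w in u + v])               # ubar u = 1, vbar v = -1
```

Taking the trace of either spin sum reproduces the normalization: the trace of $(\slashed{p} \pm m)/2m$ is $\pm 2$, one unit for each spin state.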

We are now ready to promote $\psi(x)$ to an operator. In analogy with (I.8.11) we expand
the field in plane waves¹

$$\psi_\alpha(x) = \int \frac{d^3p}{(2\pi)^{3/2}(E_p/m)^{1/2}} \sum_s \left[ b(p, s)u_\alpha(p, s)e^{-ipx} + d^\dagger(p, s)v_\alpha(p, s)e^{ipx} \right] \tag{10}$$

(Here $E_p = p_0 = +\sqrt{\vec p^{\,2} + m^2}$ and $px = p_\mu x^\mu$.) The normalization factor $(E_p/m)^{1/2}$ is
slightly different from that in (I.8.11) for reasons we will see. Otherwise, the rationale
for (10) is essentially the same as in (I.8.11). We integrate over momentum $\vec p$, sum over
spin $s$, expand in plane waves, and give names to the coefficients in the expansion. Because
$\psi$ is complex, we have, similar to the complex scalar field in chapter I.8, a $b$ operator and
a $d^\dagger$ operator.

Just as in chapter I.8, the operators $b$ and $d^\dagger$ must carry the same charge. Thus, if $b$
annihilates an electron with charge $e = -|e|$, $d^\dagger$ must remove charge $e$; that is, it creates
a positron with charge $-e = |e|$.

A word on notation: in (10) $b(p, s)$, $d^\dagger(p, s)$, $u(p, s)$, and $v(p, s)$ are written as functions
of the 4-momentum $p$ but strictly speaking they are functions of $\vec p$ only, with $p^0$ always
understood to be $+\sqrt{\vec p^{\,2} + m^2}$.


1 The notation is standard. See e.g., J. A. Bjorken and S. D. Drell, Relativistic Quantum Mechanics.

Thus let $b^\dagger(p, s)$ and $b(p, s)$ be the creation and annihilation operators for an electron
of momentum $p$ and spin $s$. Our introductory discussion indicates that we should impose

$$\{b(p, s), b^\dagger(p', s')\} = \delta^{(3)}(\vec p - \vec p\,')\,\delta_{ss'} \tag{11}$$

$$\{b(p, s), b(p', s')\} = 0 \tag{12}$$

$$\{b^\dagger(p, s), b^\dagger(p', s')\} = 0 \tag{13}$$

There is a corresponding set of relations for $d^\dagger(p, s)$ and $d(p, s)$, the creation and
annihilation operators for a positron, for instance,

$$\{d(p, s), d^\dagger(p', s')\} = \delta^{(3)}(\vec p - \vec p\,')\,\delta_{ss'} \tag{14}$$

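We now have to show that we indeed obtain (4). Write

$$\bar\psi(0) = \int \frac{d^3p'}{(2\pi)^{3/2}(E_{p'}/m)^{1/2}} \sum_{s'} \left[ b^\dagger(p', s')\bar u(p', s') + d(p', s')\bar v(p', s') \right]$$

Nothing to do but to plow ahead:

$$\{\psi(\vec x, 0), \bar\psi(0)\} = \int \frac{d^3p}{(2\pi)^3(E_p/m)} \sum_s \left[ u(p, s)\bar u(p, s)e^{i\vec p\cdot\vec x} + v(p, s)\bar v(p, s)e^{-i\vec p\cdot\vec x} \right]$$

if we take $b$ and $b^\dagger$ to anticommute with $d$ and $d^\dagger$. Using (8) and (9) we obtain

$$\{\psi(\vec x, 0), \bar\psi(0)\} = \int \frac{d^3p}{(2\pi)^3(2E_p)} \left[ (\slashed{p} + m)e^{i\vec p\cdot\vec x} + (\slashed{p} - m)e^{-i\vec p\cdot\vec x} \right]$$

Flipping $\vec p \to -\vec p$ in the second term, the pieces proportional to $\vec p\cdot\vec\gamma$ and to $m$ cancel, leaving

$$\{\psi(\vec x, 0), \bar\psi(0)\} = \int \frac{d^3p}{(2\pi)^3(2E_p)}\, 2p^0\gamma^0 e^{i\vec p\cdot\vec x} = \gamma^0\delta^{(3)}(\vec x)$$

which is just (4) slightly disguised.

Similarly, writing schematically, we have $\{\psi, \psi\} = 0$ and $\{\psi^\dagger, \psi^\dagger\} = 0$.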

We are of course free to normalize the spinors $u$ and $v$ however we like. One alternative
normalization is to define $u$ and $v$ as the $u$ and $v$ given here multiplied by $(2m)^{1/2}$, thus
changing (8) and (9) to $\sum_s u\bar u = \slashed{p} + m$ and $\sum_s v\bar v = \slashed{p} - m$. Multiplying the numerator and
denominator in (10) by $(2m)^{1/2}$, we see that the normalization factor $(E_p/m)^{1/2}$ is changed to
$(2E_p)^{1/2}$ [thus making it the same as the normalization factor for the scalar field in (I.8.11)].
This alternative normalization (let us call it “any mass normalization”) is particularly
convenient when we deal with massless spin-½ particles: we could set $m = 0$ everywhere
without ever encountering $m$ in the denominator as in (8) and (9).

The advantage of the normalization used here (let us call it “rest normalization”) is that
the spinors assume simple forms in the rest frame, as we have just seen. This would prove
to be advantageous when we calculate the magnetic moment of the electron in chapter
III.6, for example. Of course, multiplying and dividing here and there by $(2m)^{1/2}$ is a trivial
operation, and there is not much sense in arguing over the relative advantages of one
normalization over another.

In chapter II.6 we will calculate electron scattering at energies high compared to the
mass $m$, so that effectively we could set $m = 0$. Actually, even then, “rest normalization”
has the slight advantage of providing a (rather weak) check on the calculation. We set $m = 0$
everywhere we can, such as in the numerator of (8) and (9), but not where we can’t, such as
in the denominator. Then $m$ must cancel out in physical quantities such as the differential
scattering cross section.


Energy of the vacuum 


An important exercise at this point is to calculate the Hamiltonian starting with the
Hamiltonian density

$$\mathcal{H} = \pi\frac{\partial\psi}{\partial t} - \mathcal{L} = \bar\psi(i\vec\gamma\cdot\vec\partial + m)\psi \tag{15}$$

Inserting (10) into this expression and integrating, we have

$$H = \int d^3x\, \mathcal{H} = \int d^3x\, \bar\psi(i\vec\gamma\cdot\vec\partial + m)\psi = \int d^3x\, \bar\psi\, i\gamma^0\frac{\partial\psi}{\partial t} \tag{16}$$

which works out to be

$$H = \int d^3p \sum_s E_p\left[ b^\dagger(p, s)b(p, s) - d(p, s)d^\dagger(p, s) \right] \tag{17}$$

We can see the all important minus sign in (17) schematically: In (16) $\bar\psi$ gives a factor
$\sim (b^\dagger + d)$, while $\partial/\partial t$ acting on $\psi$ brings down a relative minus sign giving $\sim (b - d^\dagger)$,
thus giving us $\sim (b^\dagger + d)(b - d^\dagger) \sim b^\dagger b - dd^\dagger$ (orthogonality between spinors $\bar v u = 0$ kills
the cross terms).

To bring the second term in (17) into the right order, we anticommute $-d(p, s)d^\dagger(p, s)
= d^\dagger(p, s)d(p, s) - \delta^{(3)}(\vec 0)$ so that

$$H = \int d^3p \sum_s E_p\left[ b^\dagger(p, s)b(p, s) + d^\dagger(p, s)d(p, s) \right] - \delta^{(3)}(\vec 0)\int d^3p \sum_s E_p \tag{18}$$

The first two terms tell us that each electron and each positron of momentum $p$ and spin
$s$ has exactly the same energy $E_p$, as it should. But what about the last term? That $\delta^{(3)}(\vec 0)$
should fill us with fear and loathing.





It is OK: Noting that $\delta^{(3)}(\vec p) = [1/(2\pi)^3]\int d^3x\, e^{i\vec p\cdot\vec x}$, we see that $\delta^{(3)}(\vec 0) = [1/(2\pi)^3]\int d^3x$
(we encounter the same maneuver in exercise I.8.2) and so the last term contributes to $H$

$$E_0 = -\frac{1}{h^3}\int d^3x \int d^3p \sum_s 2\left(\tfrac{1}{2}E_p\right) \tag{19}$$

(since in natural units we have $\hbar = 1$ and hence $h = 2\pi$). We have an energy $-\tfrac{1}{2}E_p$ in each
unit-size phase-space cell $(1/h^3)\,d^3x\,d^3p$ in the sense of statistical mechanics, for each spin
and for the electron and positron separately (hence the factor of 2). This infinite additive
term $E_0$ is precisely the analog of the zero point energy $\tfrac{1}{2}\hbar\omega$ of the harmonic oscillator you
encountered in your quantum mechanics course. But it comes in with a minus sign!

The sign is bizarre and peculiar! Each mode of the Dirac field contributes $-\tfrac{1}{2}\hbar\omega$ to
the vacuum energy. In contrast, each mode of a scalar field contributes $+\tfrac{1}{2}\hbar\omega$ as we saw
in chapter I.8. This fact is of crucial importance in the development of supersymmetry,
which we will discuss in chapter VIII.4.
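
The origin of the negative zero point energy can be seen in a toy model with a single momentum and spin mode. The sketch below is an illustration under simplifying assumptions, not the field-theory calculation: the continuum $\delta^{(3)}(\vec 0)$ is replaced by 1, the electron and positron operators for the mode are represented by 4 by 4 Jordan-Wigner matrices, and the mode energy `E_p` is an arbitrary number.

```python
import numpy as np

a = np.array([[0, 1], [0, 0]], dtype=complex)
Z = np.diag([1, -1]).astype(complex)
I2 = np.eye(2, dtype=complex)

b = np.kron(a, I2)      # electron annihilation operator for this mode
d = np.kron(Z, a)       # positron annihilation operator for this mode

def dag(M):
    return M.conj().T

E_p = 1.5   # arbitrary illustrative mode energy (assumption)

# Single-mode analog of the integrand of (17).
H_unordered = E_p * (dag(b) @ b - d @ dag(d))
# After anticommuting d past d^dagger, the analog of (18): number operators minus a constant.
H_ordered = E_p * (dag(b) @ b + dag(d) @ d) - E_p * np.eye(4)

levels = np.linalg.eigvalsh(H_unordered)
print(np.allclose(H_unordered, H_ordered))
print(levels)   # lowest level is -E_p: -E_p/2 from the electron and -E_p/2 from the positron
```

The empty state has energy $-E_p$, that is, $-\tfrac{1}{2}E_p$ for the electron mode and $-\tfrac{1}{2}E_p$ for the positron mode, matching the factor $2(\tfrac{1}{2}E_p)$ in (19).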


Fermion propagating through spacetime 


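In analogy with (I.8.14), the propagator for the electron is given by $iS_{\alpha\beta}(x) \equiv
\langle 0|T\psi_\alpha(x)\bar\psi_\beta(0)|0\rangle$, where the argument of $\bar\psi$ has been set to 0 by translation invariance.
As we will see, the anticommuting character of $\psi$ requires us to define the time-ordered
product with a minus sign, namely

$$T\psi(x)\bar\psi(0) \equiv \theta(x^0)\psi(x)\bar\psi(0) - \theta(-x^0)\bar\psi(0)\psi(x) \tag{20}$$

Referring to (10), we obtain for $x^0 > 0$,

$$iS(x) = \langle 0|\psi(x)\bar\psi(0)|0\rangle = \int \frac{d^3p}{(2\pi)^3(E_p/m)} \sum_s u(p, s)\bar u(p, s)e^{-ipx} = \int \frac{d^3p}{(2\pi)^3(E_p/m)}\, \frac{\slashed{p} + m}{2m}\, e^{-ipx}$$

For $x^0 < 0$, we have to be a bit careful about the spinorial indices:

$$iS_{\alpha\beta}(x) = -\langle 0|\bar\psi_\beta(0)\psi_\alpha(x)|0\rangle = -\int \frac{d^3p}{(2\pi)^3(E_p/m)} \sum_s \bar v_\beta(p, s)v_\alpha(p, s)e^{ipx} = -\int \frac{d^3p}{(2\pi)^3(E_p/m)} \left(\frac{\slashed{p} - m}{2m}\right)_{\alpha\beta} e^{ipx}$$

using the identity (9).

Putting things together we obtain

$$iS(x) = \int \frac{d^3p}{(2\pi)^3(E_p/m)} \left[ \theta(x^0)\,\frac{\slashed{p} + m}{2m}\, e^{-ipx} - \theta(-x^0)\,\frac{\slashed{p} - m}{2m}\, e^{+ipx} \right] \tag{21}$$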


We will now show that this fermion propagator can be written more elegantly as a
4-dimensional integral:

$$iS(x) = i\int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot x}\, \frac{\slashed{p} + m}{p^2 - m^2 + i\varepsilon} = \int \frac{d^4p}{(2\pi)^4}\, e^{-ip\cdot x}\, \frac{i}{\slashed{p} - m + i\varepsilon} \tag{22}$$

To show that (22) is indeed equivalent to (21) we go through essentially the same steps as
after (I.8.14). In the complex $p^0$ plane the integrand has poles at $p^0 = \pm\sqrt{\vec p^{\,2} + m^2 - i\varepsilon} \simeq
\pm(E_p - i\varepsilon)$. For $x^0 > 0$ the factor $e^{-ip^0x^0}$ tells us to close the contour in the lower half-plane.
We go around the pole at $+(E_p - i\varepsilon)$ clockwise and obtain

$$iS(x) = (-i)\,i\int \frac{d^3p}{(2\pi)^3}\, e^{-ip\cdot x}\, \frac{\slashed{p} + m}{2E_p}$$

producing the first term in (21). For $x^0 < 0$ we are now told to close the contour in the
upper half-plane and thus we go around the pole at $-(E_p - i\varepsilon)$ anticlockwise. We obtain

$$iS(x) = i^2\int \frac{d^3p}{(2\pi)^3}\, e^{+iE_px^0 + i\vec p\cdot\vec x}\, \frac{1}{-2E_p}\left(-E_p\gamma^0 - \vec p\cdot\vec\gamma + m\right)$$

and flipping $\vec p$ we have

$$iS(x) = -\int \frac{d^3p}{(2\pi)^3}\, e^{ip\cdot x}\, \frac{1}{2E_p}\left(E_p\gamma^0 - \vec p\cdot\vec\gamma - m\right) = -\int \frac{d^3p}{(2\pi)^3}\, e^{ip\cdot x}\, \frac{1}{2E_p}\left(\slashed{p} - m\right)$$

precisely the second term in (21) with the minus sign and all. Thus, we must define the
time-ordered product with the minus sign as in (20).

After all these steps, we see that in momentum space the fermion propagator has the
elegant form

$$iS(p) = \frac{i}{\slashed{p} - m + i\varepsilon} \tag{23}$$

This makes perfect sense: $S(p)$ comes out to be the inverse of the Dirac operator $\slashed{p} - m$,
just as the scalar boson propagator $D(k) = 1/(k^2 - m^2 + i\varepsilon)$ is the inverse of the
Klein-Gordon operator $k^2 - m^2$.
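
The two forms of the integrand in (22) agree because $(\slashed{p} - m)(\slashed{p} + m) = p^2 - m^2$. The following sketch, which is illustrative and not from the text, checks this for one off-shell momentum by inverting the 4 by 4 Dirac operator numerically. It assumes the Dirac basis and signature $(+,-,-,-)$, drops the $i\varepsilon$ (harmless away from the mass shell), and uses arbitrary numerical values.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac basis; metric signature (+, -, -, -).
gamma = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
metric = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(p):
    """Return gamma^mu p_mu for contravariant components p = (p^0, p^1, p^2, p^3)."""
    p_lower = metric @ p
    return sum(c * g for c, g in zip(p_lower, gamma))

m = 1.0
p = np.array([2.0, 0.3, -0.4, 1.2])      # deliberately off shell: p^2 != m^2
p2 = p @ metric @ p
identity = np.eye(4)

S_from_inverse = np.linalg.inv(slash(p) - m * identity)       # 1 / (pslash - m)
S_from_formula = (slash(p) + m * identity) / (p2 - m**2)      # (pslash + m) / (p^2 - m^2)

print(np.allclose(slash(p) @ slash(p), p2 * identity))   # pslash squared equals p^2
print(np.allclose(S_from_inverse, S_from_formula))
```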


Poetic but confusing metaphors 

In closing this chapter let me ask you some rhetorical questions. Did I speak of an 
electron going backward in time? Did I mumble something about a sea of negative energy 
electrons? This metaphorical language, when used by brilliant minds, the likes of Dirac 
and Feynman, was evocative and inspirational, but unfortunately confused generations 
of physics students and physicists. The presentation given here is in the modern spirit, 
which seeks to avoid these potentially confusing metaphors. 


Exercises 

II.2.1 Use Noether’s theorem to derive the conserved current $J^\mu = \bar\psi\gamma^\mu\psi$. Calculate $[Q, \psi]$, thus showing that
$b$ and $d^\dagger$ must carry the same charge.

II.2.2 Quantize the Dirac field in a box of volume $V$ and show that the vacuum energy $E_0$ is indeed
proportional to $V$. [Hint: The integral over momentum $\int d^3p$ is replaced by a sum over discrete values
of the momentum.]



Lorentz Group and Weyl Spinors 


The Lorentz algebra 

In chapter II.1 we followed Dirac’s brilliantly idiosyncratic way of deriving his equation.
We develop here a more logical and mathematical theory of the Dirac spinor. A deeper
understanding of the Dirac spinor not only gives us a certain satisfaction, but is also
indispensable, as we will see later, in studying supersymmetry, one of the foundational
concepts of superstring theory; and of course, most of the fundamental particles such as
the electron and the quarks carry spin ½ and are described by spinor fields.

Let us begin by reminding ourselves how the rotation group works. The three generators
$J_i$ ($i$ = 1, 2, 3 or x, y, z) of the rotation group satisfy the commutation relation

$$[J_i, J_j] = i\epsilon_{ijk}J_k \tag{1}$$

When acting on the spacetime coordinates, written as a column vector

$$\begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$$

the generators of rotations are represented by the hermitean matrices

$$J_1 = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} \tag{2}$$

with $J_2$ and $J_3$ obtained by cyclic permutations. You should verify by laboriously
multiplying these three matrices that (1) is satisfied. Note that the signs of $J_i$ are fixed by the
commutation relation (1).

Now add the Lorentz boosts. A boost in the $x \equiv x^1$ direction transforms the spacetime
coordinates:

$$t' = (\cosh\varphi)\,t + (\sinh\varphi)\,x; \qquad x' = (\sinh\varphi)\,t + (\cosh\varphi)\,x \tag{3}$$

or for infinitesimal $\varphi$

$$t' = t + \varphi x; \qquad x' = x + \varphi t \tag{4}$$

In other words, the infinitesimal generator of a Lorentz boost in the $x$ direction is
represented by the hermitean matrix ($x^0 \equiv t$ as usual)

$$iK_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{5}$$

Similarly,

$$iK_2 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \tag{6}$$

I leave it to you to write down $iK_3$. Note that $K_i$ is defined to be antihermitean.

Check that $[J_i, K_j] = i\epsilon_{ijk}K_k$. To see that this implies that the boost generators $K_i$
transform as a 3-vector $\vec K$ under rotation, as you would expect, apply a rotation through
an infinitesimal angle $\theta$ around the 3-axis. Then (you might wish to review the material in
appendix B at this point) $K_1 \to e^{i\theta J_3}K_1e^{-i\theta J_3} = K_1 + i\theta[J_3, K_1] + O(\theta^2) = K_1 + i\theta(iK_2) +
O(\theta^2) = \cos\theta\, K_1 - \sin\theta\, K_2$ to the order indicated.

You are now about to do one of the most significant calculations in the history of
twentieth century physics. By brute force compute $[K_1, K_2]$, evidently an antisymmetric
matrix. You will discover that it is equal to $-iJ_3$. Two Lorentz boosts produce a rotation!
(You might recall from your course on electromagnetism that this mathematical fact is
responsible for the physics of the Thomas precession.)

Mathematically, the generators of the Lorentz group satisfy the following algebra [known
to the cognoscenti as SO(3, 1)]:

$$[J_i, J_j] = i\epsilon_{ijk}J_k \tag{7}$$

$$[J_i, K_j] = i\epsilon_{ijk}K_k \tag{8}$$

$$[K_i, K_j] = -i\epsilon_{ijk}J_k \tag{9}$$

Note the all-important minus sign!

How do we study this algebra? The crucial observation is that the algebra falls apart into
two pieces if we form the combinations $J_{\pm i} \equiv \tfrac{1}{2}(J_i \pm iK_i)$. You should check that

$$[J_{+i}, J_{+j}] = i\epsilon_{ijk}J_{+k} \tag{10}$$

$$[J_{-i}, J_{-j}] = i\epsilon_{ijk}J_{-k} \tag{11}$$

and most remarkably

$$[J_{+i}, J_{-j}] = 0 \tag{12}$$

This last commutation relation tells us that $J_+$ and $J_-$ form two separate SU(2) algebras.
(For more, see appendix B.)
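
The "laborious multiplication" can be delegated to a few lines of NumPy. The sketch below is illustrative and not from the text. It builds the 4 by 4 matrices $J_i$ and $K_i$ acting on $(x^0, x^1, x^2, x^3)$, with $J_2$, $J_3$ obtained from (2) by cyclic permutation and $iK_i$ real and symmetric as in (5) and (6), and then checks (7) through (12). The helper names `eps`, `comm`, and `closes` are my own.

```python
import numpy as np

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def comm(A, B):
    return A @ B - B @ A

J, K = [], []
for i in range(3):
    Ji = np.zeros((4, 4), dtype=complex)
    for j in range(3):
        for k in range(3):
            Ji[1 + j, 1 + k] = -1j * eps(i, j, k)   # (J_i)_{jk} = -i eps_{ijk} on spatial indices
    Ki = np.zeros((4, 4), dtype=complex)
    Ki[0, 1 + i] = Ki[1 + i, 0] = -1j               # so that iK_i is real and symmetric
    J.append(Ji)
    K.append(Ki)

J_plus = [(J[i] + 1j * K[i]) / 2 for i in range(3)]
J_minus = [(J[i] - 1j * K[i]) / 2 for i in range(3)]

def closes(A, B, C, sign=1):
    """True if [A_i, B_j] = sign * i * eps_ijk * C_k for all i, j."""
    return all(
        np.allclose(comm(A[i], B[j]),
                    sign * sum(1j * eps(i, j, k) * C[k] for k in range(3)))
        for i in range(3) for j in range(3))

print(closes(J, J, J), closes(J, K, K), closes(K, K, J, sign=-1))     # (7), (8), (9)
print(closes(J_plus, J_plus, J_plus), closes(J_minus, J_minus, J_minus))  # (10), (11)
print(all(np.allclose(comm(A, B), 0) for A in J_plus for B in J_minus))   # (12)
print(np.allclose(comm(K[0], K[1]), -1j * J[2]))                      # two boosts give a rotation
```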


From algebra to representation 

This means that you can simply use what you have already learned about angular
momentum in elementary quantum mechanics and the representation of SU(2) to
determine all the representations of SO(3, 1). As you know, the representations of SU(2)
are labeled by $j = 0, \tfrac12, 1, \tfrac32, \cdots$. We can think of each representation as consisting of
$(2j + 1)$ objects $\psi_m$ with $m = -j, -j + 1, \cdots, j - 1, j$ that transform into each other
under SU(2). It follows immediately that the representations of SO(3, 1) are labeled by
$(j^+, j^-)$ with $j^+$ and $j^-$ each taking on the values $0, \tfrac12, 1, \tfrac32, \cdots$. Each representation
consists of $(2j^+ + 1)(2j^- + 1)$ objects $\psi_{m^+m^-}$ with $m^+ = -j^+, -j^+ + 1, \cdots, j^+ - 1, j^+$
and $m^- = -j^-, -j^- + 1, \cdots, j^- - 1, j^-$.

Thus, the representations of SO(3, 1) are $(0, 0)$, $(\tfrac12, 0)$, $(0, \tfrac12)$, $(1, 0)$, $(0, 1)$, $(\tfrac12, \tfrac12)$, and
so on, in order of increasing dimension. We recognize the 1-dimensional representation
$(0, 0)$ as clearly the trivial one, the Lorentz scalar. By counting dimensions, we expect
that the 4-dimensional representation $(\tfrac12, \tfrac12)$ has to be the Lorentz vector, the defining
representation of the Lorentz group (see exercise II.3.1).


Spinor representations 

What about the representation $(\tfrac12, 0)$? Let us write the two objects as $\psi_\alpha$ with $\alpha = 1, 2$.
Well, what does the notation $(\tfrac12, 0)$ mean? It says that $J_{+i} = \tfrac12(J_i + iK_i)$ acting on $\psi_\alpha$ is
represented by $\tfrac12\sigma_i$ while $J_{-i} = \tfrac12(J_i - iK_i)$ acting on $\psi_\alpha$ is represented by 0. By adding
and subtracting we find that

$$J_i = \tfrac12\sigma_i \tag{13}$$

and

$$iK_i = \tfrac12\sigma_i \tag{14}$$

where the equal sign means “represented by” in this context. (By convention we do not
distinguish between upper and lower indices on the 3-dimensional quantities $J_i$, $K_i$, and
$\sigma_i$.) Note again that $K_i$ is anti-hermitean.

Similarly, let us denote the two objects in $(0, \tfrac12)$ by the peculiar symbol $\bar\chi^{\dot\alpha}$. I should
emphasize the trivial but potentially confusing point that unlike the bar used in chapter II.1,
the bar on $\bar\chi^{\dot\alpha}$ is a typographical element: Think of the symbol $\bar\chi$ as a letter in the Hittite
alphabet if you like. Similarly, the symbol $\dot\alpha$ bears no relation to $\alpha$: we do not obtain $\dot\alpha$
by operating on $\alpha$ in any way. The rather strange notation is known informally as “dotted
and undotted” and more formally as the van der Waerden notation. It is a bit excessive for our
rather modest purposes at this point, but I introduce it because it is the notation used in
supersymmetric physics and superstring theory. (Incidentally, Dirac allegedly said that he
wished he had invented the dotted and undotted notation.) Repeating the same steps as
above, you will find that on the representation $(0, \tfrac12)$ we have $J_i = \tfrac12\sigma_i$ and $iK_i = -\tfrac12\sigma_i$.
The minus sign is crucial.
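
A short numerical check, offered as an illustration rather than as part of the text, confirms that both assignments satisfy the Lorentz algebra and that one of the two SU(2) factors is indeed represented by zero in each case. The function name `satisfies_lorentz_algebra` and the variable names are my own.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

def eps(i, j, k):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def comm(A, B):
    return A @ B - B @ A

def satisfies_lorentz_algebra(J, K):
    """Check [J,J] = i eps J, [J,K] = i eps K, [K,K] = -i eps J for all index pairs."""
    ok = True
    for i in range(3):
        for j in range(3):
            rhs_J = sum(1j * eps(i, j, k) * J[k] for k in range(3))
            rhs_K = sum(1j * eps(i, j, k) * K[k] for k in range(3))
            ok &= np.allclose(comm(J[i], J[j]), rhs_J)
            ok &= np.allclose(comm(J[i], K[j]), rhs_K)
            ok &= np.allclose(comm(K[i], K[j]), -rhs_J)
    return bool(ok)

J = [s / 2 for s in sigma]
K_left = [-1j * s / 2 for s in sigma]    # (1/2, 0): iK = +sigma/2
K_right = [+1j * s / 2 for s in sigma]   # (0, 1/2): iK = -sigma/2

J_minus_left = [(J[i] - 1j * K_left[i]) / 2 for i in range(3)]    # should vanish on (1/2, 0)
J_plus_right = [(J[i] + 1j * K_right[i]) / 2 for i in range(3)]   # should vanish on (0, 1/2)

print(satisfies_lorentz_algebra(J, K_left), satisfies_lorentz_algebra(J, K_right))
print(all(np.allclose(M, 0) for M in J_minus_left + J_plus_right))
```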

The 2-component spinors $\psi_\alpha$ and $\bar\chi^{\dot\alpha}$ are called Weyl spinors and furnish perfectly good
representations of the Lorentz group. Why then does the Dirac spinor have 4 components?

The reason is parity. Under parity, $\vec x \to -\vec x$ and $\vec p \to -\vec p$, and thus $\vec J \to \vec J$ and $\vec K \to -\vec K$,
and so $\vec J_+ \leftrightarrow \vec J_-$. In other words, under parity the representations $(\tfrac12, 0) \leftrightarrow (0, \tfrac12)$.
Therefore, to describe the electron we must use both of these 2-dimensional representations, or
in mathematical notation, the 4-dimensional reducible representation $(\tfrac12, 0) \oplus (0, \tfrac12)$.

We thus stack two Weyl spinors together to form a Dirac spinor

$$\Psi = \begin{pmatrix} \psi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix} \tag{15}$$

The spinor $\Psi(p)$ is of course a function of 4-momentum $p$ [and by implication also $\psi_\alpha(p)$
and $\bar\chi^{\dot\alpha}(p)$] but we will suppress the $p$ dependence for the time being. Referring to (13)
and (14) we see that acting on $\Psi$ the generators of rotation

$$\vec J = \begin{pmatrix} \tfrac12\vec\sigma & 0 \\ 0 & \tfrac12\vec\sigma \end{pmatrix}$$

where once again the equality means “represented by,” and the generators of boost

$$i\vec K = \begin{pmatrix} \tfrac12\vec\sigma & 0 \\ 0 & -\tfrac12\vec\sigma \end{pmatrix}$$

Note once again the all-important minus sign.

Parity forces us to have a 4-component spinor but we know on the other hand that the
electron has only two physical degrees of freedom. Let us go to the rest frame. We must
project out two of the components contained in $\Psi(p_r)$ with the rest momentum $p_r \equiv
(m, \vec 0)$. With the benefit of hindsight, we write the projection operator as $\mathcal{P} \equiv \tfrac12(1 - \gamma^0)$.
You are probably guessing from the notation that $\gamma^0$ will turn out to be one of the gamma
matrices, but at this point, logically $\gamma^0$ is just some 4 by 4 matrix. The condition $\mathcal{P}^2 = \mathcal{P}$
implies that $(\gamma^0)^2 = 1$ so that the eigenvalues of $\gamma^0$ are $\pm 1$. Since $\psi_\alpha \leftrightarrow \bar\chi^{\dot\alpha}$ under parity we
naturally guess that $\psi_\alpha$ and $\bar\chi^{\dot\alpha}$ correspond to the left and right handed fields of chapter II.1.
We cannot simply use the projection to set for example $\bar\chi^{\dot\alpha}$ to 0. Parity means that we should
treat $\psi_\alpha$ and $\bar\chi^{\dot\alpha}$ on the same footing. We choose

$$\gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

or more explicitly

$$\gamma^0 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

(Different choices of $\gamma^0$ correspond to the different basis choices discussed in chapter II.1.)
In other words, in the rest frame $\psi_\alpha - \bar\chi^{\dot\alpha} = 0$. The projection to two degrees of freedom
can be written as

$$(\gamma^0 - 1)\Psi(p_r) = 0 \tag{16}$$

Indeed, we recognize this as just the Weyl basis introduced in chapter II.1.


The Dirac equation 

We have derived the Dirac equation, a bit in disguise! 

Since our derivation is based on a step-by-step study of the spinor representation of the
Lorentz group, we know how to obtain the equation satisfied by $\Psi(p)$ for any $p$: We simply
boost. Writing $\Psi(p) = e^{-i\vec\varphi\cdot\vec K}\Psi(p_r)$, we have $(e^{-i\vec\varphi\cdot\vec K}\gamma^0e^{i\vec\varphi\cdot\vec K} - 1)\Psi(p) = 0$. Introducing the
notation $\gamma^\mu p_\mu/m \equiv e^{-i\vec\varphi\cdot\vec K}\gamma^0e^{i\vec\varphi\cdot\vec K}$, we obtain the Dirac equation

$$(\gamma^\mu p_\mu - m)\Psi(p) = 0 \tag{17}$$

You can work out the details as an exercise. 
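
For readers who would like to see the boost at work before doing the exercise, here is a numerical sketch under stated assumptions; it is not the book's derivation. It uses the Weyl-basis $\gamma^0$ chosen above, the boost generators $i\vec K = \mathrm{diag}(\tfrac12\vec\sigma, -\tfrac12\vec\sigma)$, assumes $\gamma^i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}$ for the spatial gamma matrices, and identifies $E = m\cosh\varphi$, $\vec p = m\hat\varphi\sinh\varphi$. The rapidity vector is an arbitrary illustrative value, and the matrix exponential is computed by diagonalizing a Hermitian matrix.

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)
I2, Z2 = np.eye(2, dtype=complex), np.zeros((2, 2), dtype=complex)

gamma0 = np.block([[Z2, I2], [I2, Z2]])                        # Weyl basis
gammas = [np.block([[Z2, s], [-s, Z2]]) for s in sigma]        # assumed spatial gammas
iK = [np.block([[s / 2, Z2], [Z2, -s / 2]]) for s in sigma]    # boost generators on Psi

def exp_hermitian(H):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(w)) @ V.conj().T

m = 1.0
phi = np.array([0.2, -0.5, 0.7])                 # arbitrary rapidity vector (assumption)
rapidity = np.linalg.norm(phi)
H = sum(f * k for f, k in zip(phi, iK))          # phi . (iK), a Hermitian matrix
boost = exp_hermitian(-H)                        # e^{-i phi.K}
boost_inv = exp_hermitian(H)                     # e^{+i phi.K}
boosted_gamma0 = boost @ gamma0 @ boost_inv

E = m * np.cosh(rapidity)
p = m * np.sinh(rapidity) * phi / rapidity
pslash = E * gamma0 - sum(pi * g for pi, g in zip(p, gammas))

Psi_rest = np.array([1, 0, 1, 0], dtype=complex) / np.sqrt(2)  # obeys (gamma^0 - 1) Psi = 0
Psi_p = boost @ Psi_rest

print(np.allclose(boosted_gamma0, pslash / m))               # boosted projection is pslash / m
print(np.allclose((pslash - m * np.eye(4)) @ Psi_p, 0))      # Dirac equation (17)
```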

The derivation here represents the deep group theoretic way of looking at the Dirac 
equation: It is a projection boosted into an arbitrary frame. 

Note that this is an example of the power of symmetry, which pervades modern physics
and this book: Our knowledge of how the electron field transforms under the rotation
group, namely that it has spin ½, allows us to know how it transforms under the Lorentz
group. Symmetry rules!

In appendix E we will develop the dotted and undotted notation further for later use in 
the chapter on supersymmetry. 

In light of your deeper group theoretic understanding it is a good idea to reread
chapter II.1 and compare it with this chapter.


Exercises


II.3.1 Show by explicit computation that $(\tfrac12, \tfrac12)$ is indeed the Lorentz vector.


II.3.2 Work out how the six objects contained in the $(1, 0)$ and $(0, 1)$ transform under the Lorentz group.
Recall from your course on electromagnetism how the electric and magnetic fields $\vec E$ and $\vec B$ transform.
Conclude that the electromagnetic field in fact transforms as $(1, 0) \oplus (0, 1)$. Show that it is parity that
once again forces us to use a reducible representation.


II.3.3 Show that

$$e^{-i\vec\varphi\cdot\vec K}\gamma^0e^{i\vec\varphi\cdot\vec K} = \begin{pmatrix} 0 & e^{-\vec\varphi\cdot\vec\sigma} \\ e^{\vec\varphi\cdot\vec\sigma} & 0 \end{pmatrix}$$

and

$$e^{\vec\varphi\cdot\vec\sigma} = \cosh\varphi + \vec\sigma\cdot\hat\varphi\,\sinh\varphi$$

with the unit vector $\hat\varphi \equiv \vec\varphi/\varphi$. Identifying $\vec p = m\hat\varphi\sinh\varphi$, derive the Dirac equation. Show that

$$\gamma^i = \begin{pmatrix} 0 & \sigma_i \\ -\sigma_i & 0 \end{pmatrix}$$


II.3.4 Show that a spin 3/2 particle can be described by a vector-spinor $\Psi_{\alpha\mu}$, namely a Dirac spinor carrying a
Lorentz index. Find the corresponding equations of motion, known as the Rarita-Schwinger equations.
[Hint: The object $\Psi_{\alpha\mu}$ has 16 components, which we need to cut down to $2\cdot\tfrac32 + 1 = 4$ components.]




Spin-Statistics Connection 


There is no one fact in the physical world which has
a greater impact on the way things are, than the Pauli
exclusion principle.¹


Degrees of intellectual incompleteness 

In a course on nonrelativistic quantum mechanics you learned about the Pauli exclusion
principle² and its later generalization stating that particles with half integer spins, such as
electrons, obey Fermi-Dirac statistics and want to stay apart, while in contrast particles with
integer spins, such as photons or pairs of electrons, obey Bose-Einstein statistics and love
to stick together. From the microscopic structure of atoms to the macroscopic structure
of neutron stars, a dazzling wealth of physical phenomena would be incomprehensible
without this spin-statistics rule. Many elements of condensed matter physics, for instance,
band structure, Fermi liquid theory, superfluidity, superconductivity, quantum Hall effect,
and so on and so forth, are consequences of this rule.

Quantum statistics, one of the most subtle concepts in physics, rests on the fact that in
the quantum world, all elementary particles and hence all atoms, are absolutely identical
to, and thus indistinguishable from, one another.³ It should be recognized as a triumph of
quantum field theory that it is able to explain absolute identity and indistinguishability
easily and naturally. Every electron in the universe is an excitation in one and the same
electron field $\psi$. Otherwise, one might be able to imagine that the electrons we now
know came off an assembly line somewhere in the early universe and could all be slightly
different owing to some negligence in the manufacturing process.


1 I. Duck and E. C. G. Sudarshan, Pauli and the Spin-Statistics Theorem, p. 21.

2 While a student in Cambridge, E. C. Stoner came to within a hair of stating the exclusion principle.
Pauli himself in his famous paper (Zeit. f. Physik 31: 765, 1925) only claimed to “summarize and generalize
Stoner’s idea.” However, later in his Nobel Prize lecture Pauli was characteristically ungenerous toward Stoner’s
contribution. A detailed and fascinating history of the spin and statistics connection may be found in Duck and
Sudarshan, op. cit.

3 Early in life, I read in one of George Gamow’s popular physics books that he could not explain quantum
statistics (all he could manage for Fermi statistics was an analogy, invoking Greta Garbo’s famous remark “I
vont to be alone.”) and that one would have to go to school to learn about it. Perhaps this spurs me, later in life,
to write popular physics books also. See A. Zee, Einstein’s Universe, p. x.

While the spin-statistics rule has such a profound impact in quantum mechanics, its 
explanation had to wait for the development of relativistic quantum field theory. Imagine 
a civilization that for some reason developed quantum mechanics but has yet to discover 
special relativity. Physicists in this civilization eventually realize that they have to invent 
some rule to account for the phenomena mentioned above, none of which involves motion 
fast compared to the speed of light. Physics would have been intellectually unsatisfying 
and incomplete. 

One interesting criterion in comparing different areas of physics is their degree of 
intellectual incompleteness. 

Certainly, in physics we often accept a rule that cannot be explained until we move to the 
next level. For instance, in much of physics, we take as a given the fact that the charge of the 
proton and the charge of the electron are exactly equal and opposite. Quantum electrodynamics 
by itself is not capable of explaining this striking fact either. This fact, charge quantization, 
can only be deduced by embedding quantum electrodynamics into a larger structure, such 
as a grand unified theory, as we will see in chapter VII.6. (In chapter IV.4 we will learn that 
the existence of magnetic monopoles implies charge quantization, but monopoles do not 
exist in pure quantum electrodynamics.) 

Thus, the explanation of the spin-statistics connection, by Fierz and by Pauli in the late 
1930s, and by Lüders and Zumino and by Burgoyne in the late 1950s, ranks as one of the 
great triumphs of relativistic quantum field theory. I do not have the space to give a general 
and rigorous proof 4 here. I will merely sketch what goes terribly wrong if we violate the 
spin-statistics connection. 


The price of perversity 

A basic quantum principle states that if two observables commute then they are 
simultaneously diagonalizable and hence observable. A basic relativistic principle states that if 
two spacetime points are spacelike with respect to each other then no signal can propagate 
between them, and hence the measurement of an observable at one of the points cannot 
influence the measurement of another observable at the other point. 

Consider the charge density $J_0 = i(\varphi^\dagger \partial_0 \varphi - \partial_0 \varphi^\dagger \varphi)$ in a charged scalar field theory. 
According to the two fundamental principles just enunciated, $J_0(\vec x, t = 0)$ and $J_0(\vec y, t = 0)$ 
should commute for $\vec x \neq \vec y$. In calculating the commutator of $J_0(\vec x, t = 0)$ with $J_0(\vec y, t = 0)$, 
we simply use the fact that $\varphi(\vec x, t = 0)$ and $\partial_0\varphi(\vec x, t = 0)$ commute with $\varphi(\vec y, t = 0)$ and 
$\partial_0\varphi(\vec y, t = 0)$, so we just move the field at x steadily past the field at y. The commutator 
vanishes almost trivially. 


4 See I. Duck and E. C. G. Sudarshan, Pauli and the Spin-Statistics Theorem, and R. F. Streater and A. S. 
Wightman, PCT, Spin Statistics, and All That. 
Now suppose we are perverse and quantize the creation and annihilation operators in 
the expansion (I.8.11) 

$$\varphi(\vec x, t = 0) = \int \frac{d^Dk}{\sqrt{(2\pi)^D 2\omega_k}}\,[a(\vec k)e^{i\vec k\cdot\vec x} + a^\dagger(\vec k)e^{-i\vec k\cdot\vec x}]$$

according to anticommutation rules 

$$\{a(\vec k), a^\dagger(\vec q)\} = \delta^{(D)}(\vec k - \vec q)$$

and 

$$\{a(\vec k), a(\vec q)\} = 0 = \{a^\dagger(\vec k), a^\dagger(\vec q)\}$$

instead of the correct commutation rules. 

What is the price of perversity? 

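Now when we try to move $J_0(\vec y, t = 0)$ past $J_0(\vec x, t = 0)$, we have to move the field at y 
past the field at x using the anticommutator 

$$\{\varphi(\vec x, t = 0), \varphi(\vec y, t = 0)\} = \int\!\!\int \frac{d^Dk}{\sqrt{(2\pi)^D 2\omega_k}}\,\frac{d^Dq}{\sqrt{(2\pi)^D 2\omega_q}}\,\{[a(\vec k)e^{i\vec k\cdot\vec x} + a^\dagger(\vec k)e^{-i\vec k\cdot\vec x}], [a(\vec q)e^{i\vec q\cdot\vec y} + a^\dagger(\vec q)e^{-i\vec q\cdot\vec y}]\}$$

$$= \int \frac{d^Dk}{(2\pi)^D 2\omega_k}\,(e^{i\vec k\cdot(\vec x-\vec y)} + e^{-i\vec k\cdot(\vec x-\vec y)}) \tag{2}$$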

You see the problem? In a normal scalar-field theory that obeys the spin-statistics 
connection, we would have computed the commutator, and then in the last expression 
in (2) we would have gotten $(e^{i\vec k\cdot(\vec x-\vec y)} - e^{-i\vec k\cdot(\vec x-\vec y)})$ instead of $(e^{i\vec k\cdot(\vec x-\vec y)} + e^{-i\vec k\cdot(\vec x-\vec y)})$. The 
integral 

$$\int \frac{d^Dk}{(2\pi)^D 2\omega_k}\,(e^{i\vec k\cdot(\vec x-\vec y)} - e^{-i\vec k\cdot(\vec x-\vec y)})$$

would obviously vanish and all would be well. With the plus sign, we get in (2) a 
nonvanishing piece of junk. A disaster if we quantize the scalar field as anticommuting! A spin 0 
field has to be commuting. Thus, relativity and quantum physics join hands to force the 
spin-statistics connection. 

It is sometimes said that because of electromagnetism you do not sink through the floor 
and because of gravity you do not float to the ceiling, and you would be sinking or floating 
in total darkness were it not for the weak interaction, which regulates stellar burning. 
Without the spin-statistics connection, electrons would not obey Pauli exclusion. Matter 
would just collapse. 5 


Exercise 

II.4.1 Show that we would also get into trouble if we quantize the Dirac field with commutation instead of 
anticommutation rules. Calculate the commutator $[J^0(\vec x, 0), J^0(0)]$. 


5 The proof of the stability of matter, given by Dyson and Lenard, depends crucially on Pauli exclusion. 



II.5 Vacuum Energy, Grassmann Integrals, and Feynman Diagrams for Fermions 


The vacuum is a boiling sea of nothingness, full of sound 
and fury, signifying a great deal. 

—Anonymous 


Fermions are weird 

I developed the quantum field theory of a scalar field $\varphi(x)$ first in the path integral 
formalism and then in the canonical formalism. In contrast, I have thus far developed 
the quantum field theory of the free spin $\frac12$ field $\psi(x)$ only in the canonical formalism. 
We learned that the spin-statistics connection forces the field operator $\psi(x)$ to satisfy 
anticommutation relations. This immediately suggests something of a mystery in writing 
down the path integral for the spinor field $\psi$. In the path integral formalism $\psi(x)$ is not an 
operator but merely an integration variable. How do we express the fact that its operator 
counterpart in the canonical formalism anticommutes? 

We presumably cannot represent $\psi$ as a commuting variable, as we did $\varphi$. Indeed, we will 
discover that in the path integral formalism $\psi$ is to be treated not as an ordinary complex 
number but as a novel kind of mathematical entity known as a Grassmann number. 

If you thought about it, you would realize that some novel mathematical structure is 
needed. In chapter I.3 we promoted the coordinates of point particles $q_i(t)$ in quantum 
mechanics to the notion of a scalar field $\varphi(\vec x, t)$. But you already know from quantum 
mechanics that a spin $\frac12$ particle has the peculiar property that its wave function turns into 
minus itself when rotated through $2\pi$. Unlike particle coordinates, half integral spin is 
not an intuitive concept. 


Vacuum energy 


To motivate the introduction of Grassmann-valued fields I will discuss the notion of 
vacuum energy. The reason for this apparently strange strategy will become clear shortly. 

Quantum field theory was first developed to describe the scattering of photons and 
electrons, and later the scattering of particles. Recall that in chapter I.7 while studying the 
scattering of particles we encountered diagrams describing vacuum fluctuations, which 
we simply neglected (see fig. I.7.12). Quite naturally, particle physicists considered these 
fluctuations to be of no importance. Experimentally, we scatter particles off each other. Who 
cares about fluctuations in the vacuum somewhere else? It was only in the early 1970s that 
physicists fully appreciated the importance of vacuum fluctuations. We will come back to 
the importance of the vacuum 1 in a later chapter. 

In chapters I.8 and II.2 we calculated the vacuum energy of a free scalar field and of a free 
spinor field using the canonical formalism. To motivate the use of Grassmann numbers 
to formulate the path integral for the spinor field I will adopt the following strategy. First, 
I use the path integral formalism to obtain the result we already have for the free scalar 
field using the canonical formalism. Then we will see that in order to produce the result 
we already have for the free spinor field we must modify the path integral. 

By definition, vacuum fluctuations occur even when there are no sources to produce 
particles. Thus, let us consider the generating functional of a free scalar field theory in the 
absence of sources: 2 

$$Z = \int D\varphi\, e^{i\int d^4x\,\frac12[(\partial\varphi)^2 - m^2\varphi^2]} = C\left(\frac{1}{\det[\partial^2 + m^2]}\right)^{1/2} = C e^{-\frac12 \mathrm{Tr}\log(\partial^2 + m^2)} \tag{1}$$


For the first equality we used (I.2.15) and absorbed inessential factors into the constant C. 
In the second equality we used the important identity 

$$\det M = e^{\mathrm{Tr}\log M} \tag{2}$$

which you encountered in exercise I.11.2. 
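The identity (2) is easy to test numerically on a small matrix. The following is only a sketch under stated assumptions: the matrix `X` is an arbitrary illustrative choice, and the logarithm is computed from the power series of log(1 + X), which is valid only because X is chosen small.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def det(a):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(a) == 1:
        return a[0][0]
    total = 0.0
    for j, entry in enumerate(a[0]):
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def trace_log_one_plus(x, terms=200):
    """Tr log(1 + X) from the series sum_n (-1)^(n+1) Tr(X^n) / n, valid for small X."""
    n = len(x)
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    total = 0.0
    for order in range(1, terms + 1):
        power = matmul(power, x)
        total += (-1) ** (order + 1) * sum(power[i][i] for i in range(n)) / order
    return total

# Hypothetical small matrix X; M = 1 + X is the matrix whose determinant we want.
X = [[0.20, 0.10, 0.00],
     [-0.10, 0.30, 0.05],
     [0.05, 0.00, -0.20]]
M = [[X[i][j] + (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]

lhs = det(M)                            # det M
rhs = math.exp(trace_log_one_plus(X))   # exp(Tr log M)
print(lhs, rhs)
```

The two printed numbers should agree to within rounding error, which is all that (2) asserts for a finite matrix.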

Recall that $Z = \langle 0|e^{-iHT}|0\rangle$ (with $T \to \infty$ understood so that we integrate over all 
of spacetime in (1)), which in this case is just $e^{-iET}$ with E the energy of the vacuum. 
Evaluating the trace in (1) 

$$\mathrm{Tr}\, O = \int d^4x\,\langle x|O|x\rangle = \int d^4x \int \frac{d^4k}{(2\pi)^4} \int \frac{d^4q}{(2\pi)^4}\,\langle x|k\rangle\langle k|O|q\rangle\langle q|x\rangle$$

we obtain 

$$iET = \frac12 VT \int \frac{d^4k}{(2\pi)^4}\,\log(k^2 - m^2 + i\varepsilon) + A$$

where A is an infinite constant corresponding to the multiplicative factor C in (1). Recall 
that in the derivation of the path integral we had lots of divergent multiplicative factors; 
this is where they can come in. The presence of A is a good thing here since it solves a 
problem you might have noticed: The argument of the log is not dimensionless. Let us 
define m' by writing 


1 Indeed, we have already discussed one way to observe the effects of vacuum fluctuations in chapter I.9. 

2 Strictly speaking, to render the expressions here well defined we should replace $m^2$ by $m^2 - i\varepsilon$ as discussed 
earlier. 
$$A = -\frac12 VT \int \frac{d^4k}{(2\pi)^4}\,\log(k^2 - m'^2 + i\varepsilon)$$


In other words, we do not calculate the vacuum energy as such, but only the difference 
between it and the vacuum energy we would have had if the particle had mass m' instead 
of m. The arbitrarily long time T cancels out and E is proportional to the volume of space 
V, as might be expected. Thus, the (difference in) vacuum energy density is 


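$$\frac{E}{V} = -\frac{i}{2}\int \frac{d^4k}{(2\pi)^4}\,\log\left[\frac{k^2 - m^2 + i\varepsilon}{k^2 - m'^2 + i\varepsilon}\right] = -\frac{i}{2}\int \frac{d^3k}{(2\pi)^3}\int \frac{d\omega}{2\pi}\,\log\left[\frac{\omega^2 - \omega_k^2 + i\varepsilon}{\omega^2 - \omega_k'^2 + i\varepsilon}\right] \tag{3}$$

where $\omega_k \equiv +\sqrt{\vec k^2 + m^2}$. We treat the (convergent) integral over $\omega$ by integrating by parts: 

$$\int \frac{d\omega}{2\pi}\,\frac{d\omega}{d\omega}\,\log\left[\frac{\omega^2 - \omega_k^2 + i\varepsilon}{\omega^2 - \omega_k'^2 + i\varepsilon}\right] = -2\int \frac{d\omega}{2\pi}\,\omega\left[\frac{\omega}{\omega^2 - \omega_k^2 + i\varepsilon} - (\omega_k \to \omega_k')\right]$$

$$= -i\,2\omega_k^2\left(\frac{1}{-2\omega_k}\right) - (\omega_k \to \omega_k') = +i(\omega_k - \omega_k') \tag{4}$$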
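The middle step of (4) goes by quickly, so here is a sketch of the arithmetic it compresses, assuming only the $i\varepsilon$ prescription already in use. The constant pieces cancel between the $\omega_k$ and $\omega_k'$ terms, and the remaining integral is done by closing the contour in the upper half-plane, where the only pole sits at $\omega = -\omega_k + i\varepsilon$.

```latex
\begin{align*}
\frac{\omega^2}{\omega^2-\omega_k^2+i\varepsilon}
  &= 1 + \frac{\omega_k^2}{\omega^2-\omega_k^2+i\varepsilon}
  && \text{(the 1 cancels against the } \omega_k' \text{ term)} \\
\int_{-\infty}^{\infty}\frac{d\omega}{2\pi}\,
  \frac{1}{\omega^2-\omega_k^2+i\varepsilon}
  &= \frac{2\pi i}{2\pi}\cdot\frac{1}{2(-\omega_k)} = \frac{-i}{2\omega_k}
  && \text{(residue at } \omega=-\omega_k+i\varepsilon\text{)} \\
-2\,\omega_k^2\cdot\frac{-i}{2\omega_k} &= +i\,\omega_k
\end{align*}
```

Subtracting the same expression with $\omega_k \to \omega_k'$ gives $+i(\omega_k - \omega_k')$, and the prefactor $-\frac{i}{2}$ in (3) then turns this into $\frac12(\omega_k - \omega_k')$ per mode.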


Indeed, restoring h we get the result we want: 


$$\frac{E}{V} = \int \frac{d^3k}{(2\pi)^3}\left(\frac12\hbar\omega_k - \frac12\hbar\omega_k'\right) \tag{5}$$


We had to go through a few arithmetical steps to obtain this result, but the important point 
is that using the path integral formalism we have managed to obtain a result previously 
obtained using the canonical formalism. 


A peculiar sign for fermions 

Our goal is to figure out the path integral for the spinor field. Recall from chapter II.2 that 
the vacuum energy of the spinor field comes out to have the opposite sign to the vacuum 
energy of the scalar field, a sign that surely ranks among the “top ten” signs of theoretical 
physics. How are we to get it using the path integral? 

As explained in chapter I.3, the origin of (1) lies in the simple Gaussian integration 
formula 

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac12 a x^2} = \sqrt{\frac{2\pi}{a}} = \sqrt{2\pi}\, e^{-\frac12 \log a}$$

Roughly speaking, we have to find a new type of integral so that the analog of the Gaussian 
integral would go something like $e^{+\frac12 \log a}$. 
Grassmann math 


It turns out that the mathematics we need was invented long ago by Grassmann. Let 
us postulate a new kind of number, called the Grassmann or anticommuting number, 
such that if $\eta$ and $\xi$ are Grassmann numbers, then $\eta\xi = -\xi\eta$. In particular, $\eta^2 = 0$. 
Heuristically, this mirrors the anticommutation relation satisfied by the spinor field. 
Grassmann assumed that any function of $\eta$ can be expanded in a Taylor series. Since $\eta^2 = 0$, 
the most general function of $\eta$ is $f(\eta) = a + b\eta$, with a and b two ordinary numbers. 

How do we define integration over $\eta$? Grassmann noted that an essential property of 
ordinary integrals is that we can shift the dummy integration variable: $\int_{-\infty}^{+\infty} dx\, f(x + c) = 
\int_{-\infty}^{+\infty} dx\, f(x)$. Thus, we should also insist that the Grassmann integral obey the rule 
$\int d\eta\, f(\eta + \xi) = \int d\eta\, f(\eta)$, where $\xi$ is an arbitrary Grassmann number. Plugging into the 
most general function given above, we find that $\int d\eta\, b\xi = 0$. Since $\xi$ is arbitrary this can only 
hold if we define $\int d\eta\, b = 0$ for any ordinary number b, and in particular $\int d\eta \equiv \int d\eta\, 1 = 0$. 
Since given three Grassmann numbers $\chi$, $\eta$, and $\xi$, we have $\chi(\eta\xi) = (\eta\xi)\chi$, that is, the 
product $(\eta\xi)$ commutes with any Grassmann number $\chi$, we feel that the product of two 
anticommuting numbers should be an ordinary number. Thus, the integral $\int d\eta\, \eta$ is just 
an ordinary number that we can simply take to be 1: This fixes the normalization of $d\eta$. 
Thus Grassmann integration is extraordinarily simple, being defined by two rules: 

$$\int d\eta = 0 \tag{6}$$

and 

$$\int d\eta\, \eta = 1 \tag{7}$$

With these two rules we can integrate any function of $\eta$: 

$$\int d\eta\, f(\eta) = \int d\eta\, (a + b\eta) = b \tag{8}$$

if b is an ordinary number so that $f(\eta)$ is Grassmannian, and 

$$\int d\eta\, f(\eta) = \int d\eta\, (a + b\eta) = -b \tag{9}$$

if b is Grassmannian so that $f(\eta)$ is an ordinary number. Note that the concept of a 
range of integration does not exist for Grassmann integration. It is much easier to master 
Grassmann integration than ordinary integration! 

Let $\eta$ and $\bar\eta$ be two independent Grassmann numbers and a an ordinary number. Then 
the Grassmannian analog of the Gaussian integral gives 

$$\int d\eta \int d\bar\eta\, e^{\bar\eta a \eta} = \int d\eta \int d\bar\eta\, (1 + \bar\eta a \eta) = \int d\eta\, a\eta = a = e^{+\log a} \tag{10}$$

Precisely what we had wanted! 

We can generalize immediately: Let $\eta = (\eta_1, \eta_2, \cdots, \eta_N)$ be N Grassmann numbers, 
and similarly for $\bar\eta$; we then have 

$$\int d\eta \int d\bar\eta\, e^{\bar\eta A \eta} = \det A \tag{11}$$

for $A = \{A_{ij}\}$ an antisymmetric N by N matrix. (Note that contrary to the bosonic case, the 
inverse of A need not exist.) We can further generalize to a functional integral. 

As we will see shortly, we now have all the mathematics we need. 
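The two integration rules and the Gaussian formula can be checked mechanically with a toy Grassmann algebra. What follows is a minimal sketch, not a standard library: elements are stored as dictionaries from sorted tuples of generator labels to coefficients, and the function names (`gmul`, `gexp`, `berezin`, `gaussian_integral`) as well as the labeling of $\bar\eta_i$ and $\eta_i$ by integers are assumptions made purely for illustration. Integration over a generator is implemented by moving it to the far left, picking up signs, and then dropping it, which reproduces the sign in rule (9).

```python
from fractions import Fraction
from itertools import product

def gmul(f, g):
    """Product of two elements; keys are sorted tuples of generator labels."""
    out = {}
    for (ka, ca), (kb, cb) in product(f.items(), g.items()):
        if set(ka) & set(kb):
            continue  # a repeated generator squares to zero
        seq, sign = list(ka + kb), 1
        for i in range(len(seq)):            # bubble sort; each swap flips the sign
            for j in range(len(seq) - 1 - i):
                if seq[j] > seq[j + 1]:
                    seq[j], seq[j + 1] = seq[j + 1], seq[j]
                    sign = -sign
        key = tuple(seq)
        out[key] = out.get(key, 0) + sign * ca * cb
    return out

def gexp(f):
    """exp(f) for an even element with no constant term; the series terminates."""
    result, term, n = {(): Fraction(1)}, {(): Fraction(1)}, 0
    while True:
        n += 1
        term = {k: c / n for k, c in gmul(term, f).items() if c != 0}
        if not term:
            return result
        for k, c in term.items():
            result[k] = result.get(k, 0) + c

def berezin(f, var):
    """Integrate over generator `var`: move it to the far left, then drop it."""
    out = {}
    for key, c in f.items():
        if var in key:
            p = key.index(var)
            rest = key[:p] + key[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out

def gaussian_integral(A):
    """Integrate exp(etabar A eta) over all eta_i and etabar_i."""
    n = len(A)
    bar = lambda i: 2 * i        # label of etabar_i (illustrative convention)
    eta = lambda j: 2 * j + 1    # label of eta_j
    action = {}
    for i in range(n):
        for j in range(n):
            for key, c in gmul({(bar(i),): A[i][j]}, {(eta(j),): 1}).items():
                action[key] = action.get(key, 0) + c
    z = gexp(action)
    for i in range(n):
        z = berezin(z, bar(i))   # inner integral over etabar_i
        z = berezin(z, eta(i))   # then the integral over eta_i
    return z.get((), 0)

print(gaussian_integral([[5]]))                               # single variable, as in (10)
print(gaussian_integral([[2, 1, 0], [1, 3, 1], [0, 1, 4]]))   # compare with det A
```

Because each pair $d\eta_i\, d\bar\eta_i$ is even, the order in which the pairs are integrated does not matter; only the order within a pair fixes the overall sign.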


Grassmann path integral 

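In analogy with the generating functional for the scalar field 

$$Z = \int D\varphi\, e^{iS(\varphi)} = \int D\varphi\, e^{i\int d^4x\,\frac12[(\partial\varphi)^2 - (m^2 - i\varepsilon)\varphi^2]}$$

we would naturally write the generating functional for the spinor field as 

$$Z = \int D\psi D\bar\psi\, e^{iS(\psi, \bar\psi)} = \int D\psi \int D\bar\psi\, e^{i\int d^4x\, \bar\psi(i\slashed\partial - m + i\varepsilon)\psi}$$

Treating the integration variables $\psi$ and $\bar\psi$ as Grassmann-valued Dirac spinors, we 
immediately obtain 

$$Z = \int D\psi \int D\bar\psi\, e^{i\int d^4x\, \bar\psi(i\slashed\partial - m + i\varepsilon)\psi} = C' \det(i\slashed\partial - m + i\varepsilon) = C' e^{\mathrm{tr}\log(i\slashed\partial - m + i\varepsilon)} \tag{12}$$

where C' is some multiplicative constant. Using the cyclic property of the trace, we note 
that (here m is understood to be $m - i\varepsilon$) 

$$\mathrm{tr}\log(i\slashed\partial - m) = \mathrm{tr}\log\gamma^5(i\slashed\partial - m)\gamma^5 = \mathrm{tr}\log(-i\slashed\partial - m)$$

$$= \frac12[\mathrm{tr}\log(i\slashed\partial - m) + \mathrm{tr}\log(-i\slashed\partial - m)] = \frac12\,\mathrm{tr}\log(\partial^2 + m^2). \tag{13}$$

Thus, $Z = C' e^{\frac12 \mathrm{tr}\log(\partial^2 + m^2 - i\varepsilon)}$ [compare with (1)!]. 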

We see that we get the same vacuum energy we obtained in chapter II.2 using the 
canonical formalism if we remember that the trace operation here contains a factor of 
4 compared to the trace operation in (1), since $(i\slashed\partial - m)$ is a 4 by 4 matrix. 

Heuristically, we can now see the necessity for Grassmann variables. If we were to treat 
$\psi$ and $\bar\psi$ as complex numbers in (12), we would obtain something like $(1/\det[i\slashed\partial - m]) = 
e^{-\mathrm{tr}\log(i\slashed\partial - m)}$ and so have the wrong sign for the vacuum energy. We want the determinant 
to come out in the numerator rather than in the denominator. 


Dirac propagator 

Now that we have learned that the Dirac field is to be quantized by a Grassmann path 
integral we can introduce Grassmannian spinor sources $\eta$ and $\bar\eta$: 

$$Z(\eta, \bar\eta) = \int D\psi D\bar\psi\, e^{i\int d^4x\,[\bar\psi(i\slashed\partial - m)\psi + \bar\eta\psi + \bar\psi\eta]} \tag{14}$$

and proceed pretty much as before. Completing the square just as in the case of the scalar 
field, we have 

$$\bar\psi K\psi + \bar\eta\psi + \bar\psi\eta = (\bar\psi + \bar\eta K^{-1})K(\psi + K^{-1}\eta) - \bar\eta K^{-1}\eta \tag{15}$$

and thus 

$$Z(\eta, \bar\eta) = C'' e^{-i\bar\eta(i\slashed\partial - m)^{-1}\eta} \tag{16}$$

The propagator S(x) for the Dirac field is the inverse of the operator $(i\slashed\partial - m)$: in other 
words, S(x) is determined by 

$$(i\slashed\partial - m)S(x) = \delta^{(4)}(x) \tag{17}$$

As you can verify, the solution is 

$$iS(x) = \int \frac{d^4p}{(2\pi)^4}\,\frac{i e^{-ipx}}{\slashed p - m + i\varepsilon} \tag{18}$$

in agreement with (II.2.22). 

Figure II.5.1 (fermion line labeled p, p + k, p; boson line labeled k) 


Feynman rules for fermions 

We can now derive the Feynman rules for fermions in the same way that we derived 
the Feynman rules for a scalar field. For example, consider the theory of a scalar field 
interacting with a Dirac field 

$$\mathcal L = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi + \frac12[(\partial\varphi)^2 - \mu^2\varphi^2] - \lambda\varphi^4 + f\varphi\bar\psi\psi \tag{19}$$

The generating functional 

$$Z(\eta, \bar\eta, J) = \int D\psi D\bar\psi D\varphi\, e^{iS(\psi, \bar\psi, \varphi) + i\int d^4x\,(J\varphi + \bar\eta\psi + \bar\psi\eta)} \tag{20}$$

can be evaluated as a double series in the couplings $\lambda$ and f. The Feynman rules (not 
repeating the rules involving only the boson) are as follows: 

1. Draw a diagram with straight lines for the fermion and dotted lines for the boson, and label 
each line with a momentum, for example, as in figure II.5.1. 
2. Associate with each fermion line the propagator 

$$\frac{i}{\slashed p - m + i\varepsilon} = \frac{i(\slashed p + m)}{p^2 - m^2 + i\varepsilon} \tag{21}$$

3. Associate with each interaction vertex the coupling factor $if$ and the factor $(2\pi)^4\delta^{(4)}(\sum_{\rm in} p 
- \sum_{\rm out} p)$ expressing momentum conservation (the two sums are taken over the incoming 
and the outgoing momenta, respectively). 

4. Momenta associated with internal lines are to be integrated over with the measure 
$\int [d^4p/(2\pi)^4]$. 

5. External lines are to be amputated. For an incoming fermion line write $u(p, s)$ and for 
an outgoing fermion line $\bar u(p', s')$. The sources and sinks have to recognize the spin 
polarization of the fermion being produced and absorbed. [For antifermions, we would have 
$\bar v(p, s)$ and $v(p', s')$. You can see from (II.2.10) that an outgoing antifermion is associated 
with $v$ rather than $\bar v$.] 

6. A factor of (−1) is to be associated with each closed fermion line. The spinor index carried 
by the fermion should be summed over, thus leading to a trace for each closed fermion line. 
[For an example, see (II.7.7-9).] 


Note that rule 6 is unique to fermions, and is needed to account for their negative 
contribution to the vacuum energy. The Feynman diagram corresponding to vacuum 
fluctuation has no external line. I will discuss these points in detail in chapter IV.3. 

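For the theory of a massive vector field interacting with a Dirac field mentioned in 
chapter II.1 

$$\mathcal L = \bar\psi[i\gamma^\mu(\partial_\mu - ieA_\mu) - m]\psi - \frac14 F_{\mu\nu}F^{\mu\nu} + \frac12\mu^2 A_\mu A^\mu \tag{22}$$

the rules differ from above as follows. The vector boson propagator is given by 

$$\frac{i}{k^2 - \mu^2}\left(\frac{k_\mu k_\nu}{\mu^2} - g_{\mu\nu}\right) \tag{23}$$

and thus each vector boson line is associated not only with a momentum, but also with 
indices $\mu$ and $\nu$. The vertex (figure II.5.2) is associated with $ie\gamma^\mu$. 

Figure II.5.2 (vertex labeled $ie\gamma^\mu$) 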
If the vector boson line in figure II.5.2 is external and on shell, we have to specify its 
polarization. As discussed in chapter I.5, a massive vector boson has three degrees of 
polarization described by the polarization vector $\varepsilon_\mu^{(a)}$ for $a = 1, 2, 3$. The amplitude for 
emitting or absorbing a vector boson with polarization a is $ie\gamma^\mu\varepsilon_\mu^{(a)} = ie\slashed\varepsilon^{(a)}$. 

In Schwinger’s sorcery, the source for producing a vector boson $J_\mu(x)$, in contrast to 
the source for producing a scalar meson $J(x)$, carries a Lorentz index. Work in momentum 
space. Current conservation $k^\mu J_\mu(k) = 0$ implies that we can decompose $J_\mu(k) = 
\sum_{a=1}^{3} J^{(a)}(k)\varepsilon_\mu^{(a)}(k)$. The clever experimentalist sets up her machine, that is, chooses the 
functions $J^{(a)}(k)$, so as to produce a vector boson of the desired momentum k and 
polarization a. Current conservation requires $k^\mu\varepsilon_\mu^{(a)}(k) = 0$. For $k^\mu = (\omega(k), 0, 0, k)$, we could 
choose 

$$\varepsilon_\mu^{(1)}(k) = (0, 1, 0, 0), \quad \varepsilon_\mu^{(2)}(k) = (0, 0, 1, 0), \quad \varepsilon_\mu^{(3)}(k) = (-k, 0, 0, \omega(k))/m \tag{24}$$
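The choice in (24) can be checked directly: each vector should be transverse to $k^\mu$, and the three should be orthogonal with norm −1 in the (+, −, −, −) metric. The sketch below assumes the mass appearing in (24) is the vector boson mass, so that $\omega(k) = \sqrt{k^2 + m^2}$; the function names and the numerical values are illustrative only.

```python
import math

def polarization_vectors(k, m):
    """Return k^mu and the three lower-index polarization vectors of eq. (24)."""
    w = math.sqrt(k * k + m * m)          # omega(k), assuming mass m for the vector boson
    k_upper = (w, 0.0, 0.0, k)
    eps = [
        (0.0, 1.0, 0.0, 0.0),
        (0.0, 0.0, 1.0, 0.0),
        (-k / m, 0.0, 0.0, w / m),
    ]
    return k_upper, eps

def contract(upper, lower):
    """k^mu eps_mu: an upper index against a lower index needs no metric."""
    return sum(u * l for u, l in zip(upper, lower))

def minkowski(a, b):
    """Product of two same-type four-vectors with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

k_upper, eps = polarization_vectors(k=2.5, m=0.7)
for a, e in enumerate(eps, start=1):
    print(a, contract(k_upper, e), minkowski(e, e))
```

The longitudinal vector is the only one that depends on k, and its components grow like k/m at high momentum.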

In the canonical formalism, we have in analogy with the expansion of the scalar field $\varphi$ 
in (I.8.11) 

$$A_\mu(\vec x, t) = \int \frac{d^Dk}{\sqrt{(2\pi)^D 2\omega_k}}\sum_{a=1}^{3}\{a^{(a)}(\vec k)\varepsilon_\mu^{(a)}(k)e^{-i(\omega_k t - \vec k\cdot\vec x)} + a^{(a)\dagger}(\vec k)\varepsilon_\mu^{(a)*}(k)e^{i(\omega_k t - \vec k\cdot\vec x)}\} \tag{25}$$

(I trust you not to confuse the letter a used to denote annihilation and used to label 
polarization.) The point is that in contrast to $\varphi$, $A_\mu$ carries a Lorentz index, which the 
creation and annihilation operators have to “know about” (through the polarization label). 
It is instructive to compare with the expansion of the fermion field $\psi$ in (II.2.10): the spinor 
index $\alpha$ on $\psi_\alpha$ is carried in the expansion by the spinors $u(p, s)$ and $v(p, s)$. In each case, 
an index ($\mu$ in the case of the vector and $\alpha$ in the case of the spinor) known to the Lorentz 
group is “traded” for a label specifying the spin polarization (a and s respectively). 

A minor technicality: notice that I have complex conjugated the polarization vector 
associated with the creation operator $a^{(a)\dagger}(\vec k)$ in (25) even though the polarization vectors in 
(24) are real. This is because experimentalists sometimes enjoy using circularly polarized 
photons with polarization vectors $\varepsilon^{(1)}(k) = (0, 1, i, 0)/\sqrt2$, $\varepsilon^{(2)}(k) = (0, 1, -i, 0)/\sqrt2$. 


Exercises 


II.5.1 Write down the Feynman amplitude for the diagram in figure II.5.1 for the scalar theory (19). The answer 
is given in chapter III.3. 

II.5.2 Applying the Feynman rules for the vector theory (22) show that the amplitude for the diagram in 
figure II.5.3 is given by 

$$(ie)^2 i^2 \int \frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - \mu^2}\left(\frac{k_\mu k_\nu}{\mu^2} - g_{\mu\nu}\right)\bar u(p)\gamma^\nu\,\frac{\slashed p + \slashed k + m}{(p + k)^2 - m^2}\,\gamma^\mu u(p) \tag{26}$$




Electron Scattering and Gauge Invariance 


Electron-proton scattering 

We will now finally calculate a physical process that experimentalists can go out and 
measure. Consider scattering an electron off a proton. (For the moment let us ignore the 
strong interaction that the proton also participates in. We will learn in chapter III.6 how 
to take this fact into account. Here we pretend that the proton, just like the electron, is a 
structureless spin-$\frac12$ fermion obeying the Dirac equation.) To order $e^2$ the relevant Feynman 
diagram is given in figure II.6.1 in which the electron and the proton exchange a photon. 

But wait, from chapter I.5 we only know how to write down the propagator $iD_{\mu\nu} = 
i\left(\frac{k_\mu k_\nu}{\mu^2} - \eta_{\mu\nu}\right)/(k^2 - \mu^2)$ for a hypothetical massive photon. (Trivial notational change: the 
mass of the photon is now called $\mu$, since m is reserved for the mass of the electron and 
M for the mass of the proton.) In that chapter I outlined our philosophy: we will plunge 
ahead and calculate with a nonzero $\mu$ and hope that at the end we can set $\mu$ to zero. Indeed, 
when we calculated the potential energy between two external charges, we found that we can 
let $\mu \to 0$ without any signs of trouble [see (I.5.6)]. In this chapter and the next, we would 
like to see whether this will always be the case. 

Applying the Feynman rules, we obtain the amplitude for the diagram in figure II.6.1 
(with $k = P - p$ the momentum transfer in the scattering) 

$$\mathcal M(P, P_N) = (-ie)(ie)\,\frac{i}{(P - p)^2 - \mu^2}\left(\frac{k_\mu k_\nu}{\mu^2} - \eta_{\mu\nu}\right)\bar u(P)\gamma^\mu u(p)\,\bar u(P_N)\gamma^\nu u(p_N) \tag{1}$$

We have suppressed the spin labels and used the subscript N (for nucleon) to refer to the 
proton. 

Now notice that 

$$k_\mu \bar u(P)\gamma^\mu u(p) = (P - p)_\mu \bar u(P)\gamma^\mu u(p) = \bar u(P)(\slashed P - \slashed p)u(p) = \bar u(P)(m - m)u(p) = 0 \tag{2}$$

by virtue of the equations of motion satisfied by $\bar u(P)$ and $u(p)$. Similarly, $k_\mu \bar u(P_N)\gamma^\mu u(p_N) 
= 0$. 
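The vanishing in (2) can be confirmed numerically with explicit spinors. The sketch below makes several assumptions that are not in the text: it uses the Dirac basis for the gamma matrices, builds a solution of $(\slashed p - m)u = 0$ as $u = (\slashed p + m)\chi$ for an arbitrary constant spinor $\chi$ (which works on shell because $(\slashed p - m)(\slashed p + m) = p^2 - m^2$), ignores normalization, and picks arbitrary momenta.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

# gamma^0 and gamma^1..3 in the Dirac basis (an assumed choice of basis)
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def slash(p):
    """p-slash = gamma^0 p^0 - gamma^i p^i for an upper-index four-vector p."""
    signs = (1, -1, -1, -1)
    return [[sum(signs[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def spinor_u(p, m, chi):
    """An (unnormalized) solution of (p-slash - m) u = 0, built as (p-slash + m) chi."""
    ps_chi = matvec(slash(p), chi)
    return [ps_chi[i] + m * chi[i] for i in range(4)]

def bar(u):
    """Dirac adjoint u-bar = u^dagger gamma^0, returned as a row vector."""
    return [sum(complex(u[i]).conjugate() * GAMMA[0][i][j] for i in range(4))
            for j in range(4)]

def current(P, p, m, chi_out, chi_in):
    """The four numbers u-bar(P) gamma^mu u(p)."""
    ubar, u = bar(spinor_u(P, m, chi_out)), spinor_u(p, m, chi_in)
    return [sum(ubar[i] * matvec(GAMMA[mu], u)[i] for i in range(4)) for mu in range(4)]

def on_shell(m, px, py, pz):
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

m = 1.0
p = on_shell(m, 0.0, 0.0, 0.7)       # incoming momentum (illustrative)
P = on_shell(m, 0.3, -0.4, 1.2)      # outgoing momentum (illustrative)
j = current(P, p, m, [1, 0, 0, 0], [0, 1, 0, 0])
k = [P[mu] - p[mu] for mu in range(4)]
k_dot_j = k[0] * j[0] - k[1] * j[1] - k[2] * j[2] - k[3] * j[3]
print(abs(k_dot_j))
```

The individual components of the current are not zero; only the contraction with $k = P - p$ vanishes, and only when both momenta are on shell with the same mass.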
This important observation implies that the $k_\mu k_\nu/\mu^2$ term in the photon propagator does 
not enter. Thus 

$$\mathcal M(P, P_N) = -ie^2\,\frac{1}{(P - p)^2 - \mu^2}\,\bar u(P)\gamma^\mu u(p)\,\bar u(P_N)\gamma_\mu u(p_N) \tag{3}$$

and we can now set the photon mass $\mu$ to zero with impunity and replace $(P - p)^2 - \mu^2$ 
in the denominator by $(P - p)^2$. 

Note that the identity that allows us to set $\mu$ to zero is just the momentum space version 
of electromagnetic current conservation $\partial_\mu J^\mu = \partial_\mu(\bar\psi\gamma^\mu\psi) = 0$. You would notice that 
this calculation is intimately related to the one we did in going from (I.5.4) to (I.5.5), with 
$\bar u(P)\gamma^\mu u(p)$ playing the role of $J^\mu(k)$. 


Potential scattering 


That the proton mass M is so much larger than the electron mass m allows us to make 
a useful approximation familiar from elementary physics. In the limit M/m tending to 
infinity, the proton hardly moves, and we could use, for the proton, the spinors for a particle 
at rest given in chapter II.2, so that $\bar u(P_N)\gamma^0 u(p_N) \approx 1$ and $\bar u(P_N)\gamma^i u(p_N) \approx 0$. Thus 

$$\mathcal M = \frac{-ie^2}{k^2}\,\bar u(P)\gamma^0 u(p) \tag{4}$$

We recognize that we are scattering the electron in the Coulomb potential generated by 
the proton. Work out the (familiar) kinematics: $p = (E, 0, 0, |\vec p|)$ and $P = (E, 0, |\vec p|\sin\theta, 
|\vec p|\cos\theta)$. We see that $k = P - p$ is purely spacelike and $k^2 = -\vec k^2 = -4|\vec p|^2\sin^2(\theta/2)$. 
Recall from (I.4.7) that 

/ 


d y xe ik '* 




(5) 


We represent potential scattering by the Feynman diagram in figure II.6.2: the proton has 
disappeared and been replaced by a cross, which supplies the virtual photon the electron 
interacts with. It is in this sense that you could think of the Coulomb potential picturesquely 
as a swarm of virtual photons. 
Figure II.6.2 


Once again, it is instructive to use the canonical formalism to derive this expression for 
$\mathcal M$ for potential scattering. We want the transition amplitude $\langle P, S|e^{-iHT}|p, s\rangle$ with the 
single electron state $|p, s\rangle \equiv b^\dagger(p, s)|0\rangle$. The term in the Lagrangian describing the 
electron interacting with the external c-number potential $A_\mu(x)$ is given in (II.5.22) and thus 
to leading order we have the transition amplitude $ie\int d^4x\,\langle P, S|\bar\psi(x)\gamma^\mu\psi(x)|p, s\rangle A_\mu(x)$. 
Using (II.2.10-11) we evaluate this as 

$$ie\int d^4x\,\frac{1}{\rho(P)}\frac{1}{\rho(p)}\,(\bar u(P, S)\gamma^\mu u(p, s))\,e^{i(P - p)x}A_\mu(x)$$

Here $\rho(p)$ denotes the fermion normalization factor $\sqrt{(2\pi)^3 E_p/m}$ in (II.2.10). Given that 
the Coulomb potential has only a time component and does not depend on time, we see 
that integration over time gives us an energy conservation delta function, and integration 
over space the Fourier transform of the potential, as in (5). Thus the above becomes 
$(1/\rho(P))(1/\rho(p))(2\pi)\delta(E_P - E_p)\left(\frac{-ie^2}{k^2}\right)\bar u(P, S)\gamma^0 u(p, s)$. Satisfyingly, we have recovered 
the Feynman amplitude up to normalization factors and an energy conservation delta 
function, just as in (I.8.16) except for the substitution of boson for fermion normalization 
factors. Notice that we have energy conservation but not 3-momentum conservation, a fact 
we understand perfectly well when we dribble a basketball, for example. 


Electron-electron scattering 

Next, we graduate to two electrons scattering off each other: $e^-(p_1) + e^-(p_2) \to e^-(P_1) + 
e^-(P_2)$. Here we have a new piece of physics: the two electrons are identical. A profound 
tenet of quantum physics states that we cannot distinguish between the two outgoing 
electrons. Now there are two Feynman diagrams (see fig. II.6.3) to order $e^2$, obtained 
by interchanging the two outgoing electrons. The electron carrying momentum $P_1$ could 
have “come from” the incoming electron carrying momentum $p_1$ or the incoming electron 
carrying momentum $p_2$. 

We have for figure II.6.3a the amplitude 

$$A(P_1, P_2) = (ie^2/(P_1 - p_1)^2)\,\bar u(P_1)\gamma^\mu u(p_1)\,\bar u(P_2)\gamma_\mu u(p_2)$$
Figure II.6.3 (panels (a) and (b)) 


as before. We have only indicated the dependence of A on the final momenta, suppressing 
the other dependence. By Fermi statistics, the amplitude for the diagram in figure II.6.3b 
is then $-A(P_2, P_1)$. Thus the invariant amplitude for two electrons of momentum $p_1$ and 
$p_2$ to scatter into two electrons with momentum $P_1$ and $P_2$ is 

$$\mathcal M = A(P_1, P_2) - A(P_2, P_1) \tag{6}$$

To obtain the cross section we have to square the amplitude 

$$|\mathcal M|^2 = [|A(P_1, P_2)|^2 + (P_1 \leftrightarrow P_2)] - 2\,\mathrm{Re}\,A(P_2, P_1)^*A(P_1, P_2) \tag{7}$$
At this point we have to do a fair amount of arithmetic, but keep in mind that there 
is nothing conceptually intricate in what follows. First, we have to learn to complex 
conjugate spinor amplitudes. Using (II.1.15), note that in general $(\bar u(p')\gamma_\mu \cdots \gamma_\nu u(p))^* = 
u(p)^\dagger\gamma_\nu^\dagger \cdots \gamma_\mu^\dagger\gamma^0 u(p') = \bar u(p)\gamma_\nu \cdots \gamma_\mu u(p')$. Here $\gamma_\mu \cdots \gamma_\nu$ represents a product of any 
number of $\gamma$ matrices. Complex conjugation reverses the order of the product and 
interchanges the two spinors. Thus we have 

$$|A(P_1, P_2)|^2 = \frac{e^4}{(P_1 - p_1)^4}\,[\bar u(P_1)\gamma^\mu u(p_1)\,\bar u(p_1)\gamma^\nu u(P_1)]\,[\bar u(P_2)\gamma_\mu u(p_2)\,\bar u(p_2)\gamma_\nu u(P_2)] \tag{8}$$

which factorizes with one factor involving spinors carrying momentum with subscript 1 
and another factor involving spinors carrying momentum with subscript 2. In contrast, 
the interference term $A(P_2, P_1)^*A(P_1, P_2)$ does not factorize. 

In the simplest experiments, the initial electrons are unpolarized, and the polarization 
of the outgoing electrons is not measured. We average over initial spins and sum over final 
spins using (II.2.8): 

$$\sum_s u(p, s)\bar u(p, s) = \frac{\slashed p + m}{2m} \tag{9}$$

In averaging and summing $|A(P_1, P_2)|^2$ we encounter the object (displaying the spin labels 
explicitly) 

$$\tau^{\mu\nu}(P_1, p_1) \equiv \frac12\sum\sum \bar u(P_1, S)\gamma^\mu u(p_1, s)\,\bar u(p_1, s)\gamma^\nu u(P_1, S) \tag{10}$$

$$= \frac{1}{2(2m)^2}\,\mathrm{tr}(\slashed P_1 + m)\gamma^\mu(\slashed p_1 + m)\gamma^\nu \tag{11}$$

which is to be multiplied by $\tau_{\mu\nu}(P_2, p_2)$. 

Well, we, or rather you, have to develop some technology for evaluating the trace of 
products of gamma matrices. The key observation is that the square of a gamma matrix 
is either +1 or —1, and different gamma matrices anticommute. Clearly, the trace of a 
product of an odd number of gamma matrices vanishes. Furthermore, since there are only 
four different gamma matrices, the trace of a product of six gamma matrices can always be 
reduced to the trace of a product of four gamma matrices, since there are always pairs of 
gamma matrices that are equal and can be brought together by anticommuting. Similarly 
for the trace of a product of an even higher number of gamma matrices. 

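Hence $\tau^{\mu\nu}(P_1, p_1) = \frac{1}{2(2m)^2}(\mathrm{tr}(\slashed P_1\gamma^\mu\slashed p_1\gamma^\nu) + m^2\,\mathrm{tr}(\gamma^\mu\gamma^\nu))$. Writing $\mathrm{tr}(\slashed P_1\gamma^\mu\slashed p_1\gamma^\nu) = 
P_{1\rho}\,p_{1\lambda}\,\mathrm{tr}(\gamma^\rho\gamma^\mu\gamma^\lambda\gamma^\nu)$ and using the expressions for the trace of a product of an even 
number of gamma matrices listed in appendix D, we obtain $\tau^{\mu\nu}(P_1, p_1) = \frac{1}{2(2m)^2}\,4(P_1^\mu p_1^\nu - 
\eta^{\mu\nu}P_1\cdot p_1 + P_1^\nu p_1^\mu + m^2\eta^{\mu\nu})$. 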
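The trace identities used here can be verified by brute force over all index combinations. This is a sketch under assumptions not stated in the text: the gamma matrices are taken in the Dirac basis (traces are basis independent, so any representation would do), and the four-gamma formula checked is the standard one, $\mathrm{tr}(\gamma^\rho\gamma^\mu\gamma^\lambda\gamma^\nu) = 4(\eta^{\rho\mu}\eta^{\lambda\nu} - \eta^{\rho\lambda}\eta^{\mu\nu} + \eta^{\rho\nu}\eta^{\mu\lambda})$, which is what produces the three momentum terms above.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA = [1, -1, -1, -1]   # diagonal entries of the metric

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

# gamma^0, gamma^1, gamma^2, gamma^3 in the Dirac basis (an assumed choice)
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def trace_of(*indices):
    """Trace of the product gamma^{i1} gamma^{i2} ... for the given indices."""
    prod = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    for mu in indices:
        prod = matmul(prod, GAMMA[mu])
    return sum(prod[i][i] for i in range(4))

def eta(mu, nu):
    return ETA[mu] if mu == nu else 0

def four_trace_formula(r, m, l, n):
    return 4 * (eta(r, m) * eta(l, n) - eta(r, l) * eta(m, n) + eta(r, n) * eta(m, l))

two_ok = all(abs(trace_of(m, n) - 4 * eta(m, n)) < 1e-12
             for m, n in product(range(4), repeat=2))
odd_ok = all(abs(trace_of(r, m, n)) < 1e-12
             for r, m, n in product(range(4), repeat=3))
four_ok = all(abs(trace_of(r, m, l, n) - four_trace_formula(r, m, l, n)) < 1e-12
              for r, m, l, n in product(range(4), repeat=4))
print(two_ok, odd_ok, four_ok)
```

Contracting the four-gamma formula with $P_{1\rho}\,p_{1\lambda}$ gives $4(P_1^\mu p_1^\nu - \eta^{\mu\nu}P_1\cdot p_1 + P_1^\nu p_1^\mu)$, and the two-gamma trace supplies the $m^2\eta^{\mu\nu}$ term.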

In averaging and summing $A(P_2, P_1)^*A(P_1, P_2)$ we encounter the more involved object 

$$\kappa \equiv \frac14\sum\sum\sum\sum \bar u(P_1)\gamma^\mu u(p_1)\,\bar u(P_2)\gamma_\mu u(p_2)\,\bar u(p_1)\gamma^\nu u(P_2)\,\bar u(p_2)\gamma_\nu u(P_1) \tag{12}$$

where for simplicity of notation we have suppressed the spin labels. Applying (9) we can 
write $\kappa$ as a single trace. The evaluation of $\kappa$ is quite tedious, since it involves traces of 
products of up to eight gamma matrices. 
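We will be content to study electron-electron scattering in the relativistic limit in which 
m may be neglected compared to the momenta. As explained in chapter II.2, while we 
are using the “rest normalization” for spinors we can nevertheless set m to 0 wherever 
possible. Then 

$$\kappa = \frac{1}{4(2m)^4}\,\mathrm{tr}(\slashed P_1\gamma^\mu\slashed p_1\gamma^\nu\slashed P_2\gamma_\mu\slashed p_2\gamma_\nu) \tag{13}$$

Applying the identities in appendix D to (13) we obtain $\mathrm{tr}(\slashed P_1\gamma^\mu\slashed p_1\gamma^\nu\slashed P_2\gamma_\mu\slashed p_2\gamma_\nu) = 
-2\,\mathrm{tr}(\slashed P_1\gamma^\mu\slashed p_1\slashed p_2\gamma_\mu\slashed P_2) = -32\,p_1\cdot p_2\,P_1\cdot P_2$. 

In the same limit $\tau^{\mu\nu}(P_1, p_1) = \frac{2}{(2m)^2}(P_1^\mu p_1^\nu + P_1^\nu p_1^\mu - \eta^{\mu\nu}P_1\cdot p_1)$ and thus 

$$\tau^{\mu\nu}(P_1, p_1)\tau_{\mu\nu}(P_2, p_2) = \frac{4}{(2m)^4}(P_1^\mu p_1^\nu + P_1^\nu p_1^\mu - \eta^{\mu\nu}P_1\cdot p_1)(2P_{2\mu}p_{2\nu} - \eta_{\mu\nu}P_2\cdot p_2) \tag{14}$$

$$= \frac{4\cdot 2}{(2m)^4}(p_1\cdot p_2\,P_1\cdot P_2 + p_1\cdot P_2\,p_2\cdot P_1) \tag{15}$$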

An amusing story to break up this tedious calculation: Murph Goldberger, who was 
a graduate student at the University of Chicago after working on the Manhattan Project 
during the war and whom I mentioned in chapter II.1 regarding the Feynman slash, told 
me that Enrico Fermi marvelled at this method of taking a trace that young people were 
using to sum over spin-$\frac12$ polarizations. Fermi and others in the older generation had 
simply memorized the specific form of the spinors in the Dirac basis (which you know from 
doing exercise II. 1.3) and consequently the expressions for u(P, S)y fl u(p, s). They simply 
multiplied these expressions together and added up the different possibilities. Fermi was 
skeptical of the fancy schmancy method the young Turks were using and challenged Murph 
to a race on the blackboard. Of course, with his lightning speed, Fermi won. To me, it is 
amazing, living in the age of string theory, that another generation once regarded the 
trace as fancy math. I confessed that I was even a bit doubtful of this story until I looked at 
Feynman’s book Quantum Electrodynamics, but guess what, Feynman indeed constructed, 
on page 100 in the edition I own, a table showing the result for the amplitude squared for 
various spin polarizations. Some pages later, he mentioned that polarizations could also 
be summed using the spur (the original German word for trace). Another amusing aside: 
spur is cognate with the English word spoor, meaning animal droppings, and hence also 
meaning track, trail, and trace. All right, back to work! 

While it is not the purpose of this book to teach you to calculate cross sections for a 
living, it is character building to occasionally push calculations to the bitter end. Here 
is a good place to introduce some useful relativistic kinematics. In calculating the cross 
section for the scattering process $p_1 + p_2 \to P_1 + P_2$ (with the masses of the four particles 
all different in general) we typically encounter Lorentz invariants such as $p_1\cdot P_2$. A priori, 
you might think there are six such invariants, but in fact, you know that there are only 
two physical variables, the incident energy $E$ and the scattering angle $\theta$. The cleanest way to 
organize these invariants is to introduce what are called Mandelstam variables: 

$$s \equiv (p_1 + p_2)^2 = (P_1 + P_2)^2 \tag{16}$$

$$t \equiv (P_1 - p_1)^2 = (P_2 - p_2)^2 \tag{17}$$

$$u \equiv (P_1 - p_2)^2 = (P_2 - p_1)^2 \tag{18}$$

You know that there must be an identity reducing the three variables $s$, $t$, and $u$ to two. 
Show that (with an obvious notation) 

$$s + t + u = m_1^2 + m_2^2 + M_1^2 + M_2^2 \tag{19}$$
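A quick numerical sanity check of (19), as a sketch only: the identity is purely algebraic once $P_2 = p_1 + p_2 - P_1$ is imposed, so the test vectors below are random and need not be on any particular mass shell (each "mass squared" is simply the Minkowski square of the corresponding vector). The function name `sq` is illustrative.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])  # metric signature (+, -, -, -)

def sq(p):
    """Minkowski square p.p of a 4-vector."""
    return p @ eta @ p

rng = np.random.default_rng(1)
p1, p2, P1 = rng.normal(size=(3, 4))
P2 = p1 + p2 - P1                        # momentum conservation

s = sq(p1 + p2)
t = sq(P1 - p1)
u = sq(P1 - p2)
mass_sum = sq(p1) + sq(p2) + sq(P1) + sq(P2)
```

With momentum conservation imposed, the alternative forms in (16) to (18), such as $t = (P_2 - p_2)^2$, agree with the ones computed here.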


For our calculation here, we specialize to the center of mass frame in the relativistic limit 
$p_1 = E(1, 0, 0, 1)$, $p_2 = E(1, 0, 0, -1)$, $P_1 = E(1, \sin\theta, 0, \cos\theta)$, and 
$P_2 = E(1, -\sin\theta, 0, -\cos\theta)$. Hence 

$$p_1\cdot p_2 = P_1\cdot P_2 = 2E^2 = \tfrac{1}{2}s \tag{20}$$

$$p_1\cdot P_1 = p_2\cdot P_2 = 2E^2\sin^2\frac{\theta}{2} = -\tfrac{1}{2}t \tag{21}$$

and 

$$p_1\cdot P_2 = p_2\cdot P_1 = 2E^2\cos^2\frac{\theta}{2} = -\tfrac{1}{2}u \tag{22}$$

Also, in this limit $(P_1 - p_1)^4 = (-2p_1\cdot P_1)^2 = 16E^4\sin^4(\theta/2) = t^2$. Putting it together, we 
obtain $\frac{1}{4}\sum\sum\sum\sum|\mathcal{M}|^2 = (e^4/4m^4) f(\theta)$, where 

$$f(\theta) = \frac{s^2 + u^2}{t^2} + \frac{2s^2}{tu} + \frac{s^2 + t^2}{u^2} = \frac{s^4 + t^4 + u^4}{t^2u^2}$$

$$= \frac{1 + \cos^4(\theta/2)}{\sin^4(\theta/2)} + \frac{2}{\sin^2(\theta/2)\cos^2(\theta/2)} + \frac{1 + \sin^4(\theta/2)}{\cos^4(\theta/2)} \tag{23}$$

$$= 2\left(\frac{1}{\sin^4(\theta/2)} + 1 + \frac{1}{\cos^4(\theta/2)}\right) \tag{24}$$


The physical origin of each of the terms in (23) [before we simplify with trigonometric 
identities] to get to (24) is clear. The first term strongly favors forward scattering due 
to the photon propagator $\sim 1/k^2$ blowing up at $k \sim 0$. The third term is required by the 
indistinguishability of the two outgoing electrons: the scattering must be symmetric under 
$\theta \to \pi - \theta$, since experimentalists can’t tell whether a particular incoming electron has 
scattered forward or backward. The second term is the most interesting of all: it comes from 
quantum interference. If we had mistakenly thought that electrons are bosons and taken 
the plus sign in (6), the second term in $f(\theta)$ would come with a minus sign. This makes 
a big difference: for instance, $f(\pi/2)$ would be $5 - 8 + 5 = 2$ instead of $5 + 8 + 5 = 18$. 
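The three terms of $f(\theta)$ and the compact form (24) can be compared numerically. This is a small sketch with illustrative function names; the Mandelstam variables are taken in the massless center of mass kinematics of (20) to (22), and the `sign` argument mimics the hypothetical "wrong statistics" case discussed above.

```python
import math

def mandelstam(E, theta):
    """Massless center-of-mass Mandelstam variables from (20)-(22)."""
    s = 4 * E**2
    t = -4 * E**2 * math.sin(theta / 2) ** 2
    u = -4 * E**2 * math.cos(theta / 2) ** 2
    return s, t, u

def f_terms(theta, E=1.0, sign=+1):
    """Direct, interference, and exchange terms of f(theta).

    sign=+1 is the fermionic result; sign=-1 flips the interference term.
    """
    s, t, u = mandelstam(E, theta)
    return ((s**2 + u**2) / t**2,
            sign * 2 * s**2 / (t * u),
            (s**2 + t**2) / u**2)

def f_compact(theta):
    """Simplified form (24)."""
    s2 = math.sin(theta / 2) ** 2
    c2 = math.cos(theta / 2) ** 2
    return 2 * (1 / s2**2 + 1 + 1 / c2**2)

def f_symmetric(theta, E=1.0):
    """(s^4 + t^4 + u^4) / (t^2 u^2), valid when s + t + u = 0."""
    s, t, u = mandelstam(E, theta)
    return (s**4 + t**4 + u**4) / (t**2 * u**2)
```

At $\theta = \pi/2$ the three terms evaluate to 5, 8, and 5, which is the arithmetic quoted in the text.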

Since the conversion of a squared probability amplitude to a cross section is conceptually 
the same as in nonrelativistic quantum mechanics (divide by the incoming flux, etc.), I 
will relegate the derivation to an appendix and let you go the last few steps and obtain the 
differential cross section as an exercise: 

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{8E^2}\,f(\theta) \tag{25}$$

with the fine structure constant $\alpha = e^2/4\pi \approx 1/137$. 

An amazing subject 

When you think about it, theoretical physics is truly an amazing business. After the 
appropriate equipment is assembled and high energy electrons are scattered off each 
other, experimentalists indeed would find the differential cross section given in (25). There 
is almost something magical about it. 


Appendix: Decay rate and cross section 


To make contact with experiments, we have to convert transition amplitudes into the scattering cross sections 
and decay rates that experimentalists actually measure. I assume that you are already familiar with the physical 
concepts behind these measurements from a course on nonrelativistic quantum mechanics, and thus here we 
focus more on those aspects specific to quantum field theory. 

To be able to count states, we adopt an expedient probably already familiar to you from quantum statistical 
mechanics, namely that we enclose our system in a box, say a cube with length $L$ on each side with $L$ much larger 
than the characteristic size of our system. With periodic boundary conditions, the allowed plane wave states $e^{i\vec{p}\cdot\vec{x}}$ 
carry momentum 

$$\vec{p} = \frac{2\pi}{L}(n_x, n_y, n_z) \tag{26}$$

where the $n_i$'s are three integers. The allowed values of momentum form a lattice of points in momentum space 
with spacing $2\pi/L$ between points. Experimentalists measure momentum with finite resolution, small but much 
larger than $2\pi/L$. Thus, an infinitesimal volume $d^3p$ in momentum space contains $d^3p/(2\pi/L)^3 = V d^3p/(2\pi)^3$ 
states with $V = L^3$ the volume of the box. We obtain the correspondence 

$$\int\frac{d^3p}{(2\pi)^3}\,f(p) \leftrightarrow \frac{1}{V}\sum_p f(p) \tag{27}$$

In the sum $p$ ranges over the discrete values in (26). The correspondence (27) between continuum 
normalization and the discrete box normalization implies that 

$$\delta^{(3)}(p - p') \leftrightarrow \frac{V}{(2\pi)^3}\,\delta_{pp'} \tag{28}$$

with the Kronecker delta $\delta_{pp'}$ equal to 1 if $p = p'$ and 0 otherwise. One way of remembering these correspondences 
is simply by dimensional matching. 
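The state-counting rule $V d^3p/(2\pi)^3$ can be illustrated by brute force: count the lattice momenta (26) that fall inside a spherical shell and compare with the continuum estimate. This is only a sketch; the box size and shell radii are arbitrary illustrative choices, and the agreement is approximate because a finite shell contains a finite number of lattice points.

```python
import numpy as np

def count_states(L, p_min, p_max):
    """Count lattice momenta (2*pi/L)*(nx, ny, nz) with p_min <= |p| < p_max."""
    n_max = int(np.ceil(p_max * L / (2 * np.pi)))
    n = np.arange(-n_max, n_max + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    p = (2 * np.pi / L) * np.sqrt(nx**2 + ny**2 + nz**2)
    return int(np.count_nonzero((p >= p_min) & (p < p_max)))

def continuum_estimate(L, p_min, p_max):
    """V * (momentum-space volume of the shell) / (2*pi)^3."""
    V = L**3
    shell_volume = 4 * np.pi / 3 * (p_max**3 - p_min**3)
    return V * shell_volume / (2 * np.pi) ** 3

L_box, p_lo, p_hi = 50.0, 2.5, 3.0       # illustrative values only
n_lattice = count_states(L_box, p_lo, p_hi)
n_continuum = continuum_estimate(L_box, p_lo, p_hi)
```

Increasing `L_box` at fixed shell radii makes the relative difference between the two counts smaller, which is the sense in which the sum in (27) goes over to the integral.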

Let us now look at the expansion (I.8.17) of a complex scalar field 

$$\varphi(\vec{x}, t) = \int\frac{d^3k}{\sqrt{(2\pi)^3 2\omega_k}}\left[a(\vec{k})e^{-i(\omega_k t - \vec{k}\cdot\vec{x})} + b^\dagger(\vec{k})e^{i(\omega_k t - \vec{k}\cdot\vec{x})}\right] \tag{29}$$

in terms of creation and annihilation operators. Henceforth, in order not to clutter up the page, I will abuse 
notation slightly, for example, dropping the arrows on vectors when there is no risk of confusion. Going over to 
the box normalization, we replace the commutation relation $[a(k), a^\dagger(k')] = \delta^{(3)}(k - k')$ by 

$$[a(k), a^\dagger(k')] = \frac{V}{(2\pi)^3}\,\delta_{kk'} \tag{30}$$

We now normalize the creation and annihilation operators by 

$$a(k) = \left(\frac{V}{(2\pi)^3}\right)^{1/2}\tilde{a}(k) \tag{31}$$

so that 

$$[\tilde{a}(k), \tilde{a}^\dagger(k')] = \delta_{k,k'} \tag{32}$$

Thus the state $|k\rangle \equiv \tilde{a}^\dagger(k)|0\rangle$ is properly normalized: $\langle k|k\rangle = 1$. Using (27) and (31), we end up with 

$$\varphi(x) = \frac{1}{V^{1/2}}\sum_k\frac{1}{\sqrt{2\omega_k}}\left(\tilde{a}(k)e^{-ikx} + \tilde{b}^\dagger(k)e^{ikx}\right) \tag{33}$$

We specified a complex, rather than a real, scalar field because then, as you showed in exercise I.8.4, a 
conserved current can be defined with the corresponding charge 
$Q = \int d^3x\,J^0 = \int d^3k\,(a^\dagger(k)a(k) - b^\dagger(k)b(k)) \to \sum_k(\tilde{a}^\dagger(k)\tilde{a}(k) - \tilde{b}^\dagger(k)\tilde{b}(k))$. 
It follows immediately that $\langle k|Q|k\rangle = 1$, so that for the state $|k\rangle$ we have one particle 
in the box. 

To derive the formula for the decay rate, we focus, for the sake of pedagogical clarity, on a toy Lagrangian 
$\mathcal{L} = g(\eta^\dagger\xi^\dagger\varphi + \text{h.c.})$ describing the decay $\varphi \to \eta + \xi$ of a meson into two other mesons. (As usual, we display 
only the part of the Lagrangian that is of immediate interest. In other words, we suppress the stuff you have long 
since mastered: $\mathcal{L} = \partial\varphi^\dagger\partial\varphi - m_\varphi^2\varphi^\dagger\varphi + \cdots$ and all the rest.) 

The transition amplitude $\langle p, q|e^{-iHT}|k\rangle$ is given to lowest order by $A = i\langle p, q|\int d^4x\,(g\,\eta^\dagger(x)\xi^\dagger(x)\varphi(x))|k\rangle$. 
Here we use the states we “carefully” normalized above, namely the ones created by the various “analogs” of $\tilde{a}^\dagger$. 
(Just as in quantum mechanics, strictly, we should use wave packets instead of plane wave states. I assume that 
you have gone through that at least once.) Plugging in the various “analogs” of (33), we have 

$$A = ig\left(\frac{1}{\sqrt{V}}\right)^3\sum_{p'}\sum_{q'}\sum_{k'}\frac{1}{\sqrt{2\omega_{p'}2\omega_{q'}2\omega_{k'}}}\int d^4x\,e^{i(p' + q' - k')x}\,\langle p, q|\tilde{a}^\dagger(p')\tilde{a}^\dagger(q')\tilde{a}(k')|k\rangle$$

$$= ig\,\frac{1}{V^{3/2}}\,\frac{1}{\sqrt{2\omega_p 2\omega_q 2\omega_k}}\,(2\pi)^4\delta^{(4)}(p + q - k) \tag{34}$$


Here we have committed various minor transgressions against notational consistency. For example, since 
the three particles $\varphi$, $\eta$, and $\xi$ have different masses, the symbol $\omega$ represents, depending on context, different 
functions of its subscript (thus $\omega_p = \sqrt{p^2 + m_\eta^2}$, and so forth). Similarly, $\tilde{a}(k')$ should really be written as $\tilde{a}_\varphi(k')$, 
and so forth. Also, we confound 3- and 4-momenta. I would like to think that these all fall under the category of 
what the Catholic church used to call venial sins. In any case, you know full well what I am talking about. 

Next, we square the transition amplitude $A$ to find the transition probability. You might be worried, because 
it appears that we will have to square the Dirac delta function. But fear not, we have enclosed ourselves in a box. 
Furthermore, we are in reality calculating $\langle p, q|e^{-iHT}|k\rangle$, the amplitude for the state $|k\rangle$ to become the state 
$|p, q\rangle$ after a large but finite time $T$. Thus we could in all comfort write 

$$[(2\pi)^4\delta^{(4)}(p + q - k)]^2 = (2\pi)^4\delta^{(4)}(p + q - k)\int d^4x\,e^{i(p + q - k)x}$$

$$= (2\pi)^4\delta^{(4)}(p + q - k)\int d^4x = (2\pi)^4\delta^{(4)}(p + q - k)\,VT \tag{35}$$


Thus the transition probability per unit time, aka the transition rate, is equal to 

$$\frac{|A|^2}{T} = \frac{V}{V^3}\left(\frac{1}{2\omega_p 2\omega_q 2\omega_k}\right)(2\pi)^4\delta^{(4)}(p + q - k)\,g^2 \tag{36}$$


Recall that there are $V d^3p/(2\pi)^3$ states in the volume $d^3p$ in momentum space. Hence, multiplying the number 
of final states $(V d^3p/(2\pi)^3)(V d^3q/(2\pi)^3)$ by the transition rate $|A|^2/T$, we obtain the differential decay rate of 
a meson into two mesons carrying off momenta in some specified range $d^3p$ and $d^3q$: 

$$d\Gamma = \frac{1}{2\omega_k}\,\frac{V}{V^3}\left(V\frac{d^3p}{(2\pi)^3 2\omega_p}\right)\left(V\frac{d^3q}{(2\pi)^3 2\omega_q}\right)(2\pi)^4\delta^{(4)}(p + q - k)\,g^2 \tag{37}$$

Yes sir, indeed, the factors of $V$ cancel, as they should. 

To obtain the total decay rate $\Gamma$ we integrate over $d^3p$ and $d^3q$. Notice the factor $1/2\omega_k$: the decay rate for a 
moving particle is smaller than that of a resting particle by a factor $m/\omega_k$. We have derived time dilation, as we 
had better. 





We are now ready to generalize to the decay of a particle carrying momentum $P$ into $n$ particles carrying 
momenta $k_1, \cdots, k_n$. For definiteness, we suppose that these are all Bose particles. First, we draw all the relevant 
Feynman diagrams and compute the invariant amplitude $\mathcal{M}$. (In our toy example, $\mathcal{M} = ig$.) Second, the transition 
probability contains a factor $1/V^{n+1}$, one factor of $1/V$ for each particle, but when we squared the momentum 
conservation delta function we also obtained a factor of $VT$, which converts the transition probability into a 
transition rate and knocks off one power of $V$, leaving the factor $1/V^n$. Next, when we sum over final states, we 
have a factor $V d^3k_i/((2\pi)^3 2\omega_{k_i})$ for each particle in the final state. Thus the factors of $V$ indeed cancel. 

The differential decay rate of a boson of mass $M$ in its rest frame is thus given by 

$$d\Gamma = \frac{1}{2M}\,\frac{d^3k_1}{(2\pi)^3 2\omega(k_1)}\cdots\frac{d^3k_n}{(2\pi)^3 2\omega(k_n)}\,(2\pi)^4\delta^{(4)}\left(P - \sum_{i=1}^n k_i\right)|\mathcal{M}|^2 \tag{38}$$


At this point, we recall that, as explained in chapter II.2, in the expansion of a fermion field into creation and 
annihilation operators [see (II.2.10)], we have a choice of two commonly used normalizations, trivially related 
by a factor $(2m)^{1/2}$. If you choose to use the “rest normalization” so that spinors come out nice in the rest frame, 
then the field expansion contains the normalization factor $(E_p/m)^{1/2}$ instead of the factor $(2\omega_k)^{1/2}$ for a Bose 
field [see (I.8.11)]. This entails the trivial replacement, for each fermion, of the factor $2\omega(k) = 2\sqrt{\vec{k}^2 + m^2}$ by 
$E(p)/m = \sqrt{\vec{p}^2 + m^2}/m$. In particular, for the decay rate of a fermion the factor $1/2M$ should be removed. If 
you choose the “any mass normalization,” you have to remember to normalize the spinors appearing in $\mathcal{M}$ 
correctly, but you need not touch the phase space factors derived here. 

We next turn to scattering cross sections. As I already said, the basic concepts involved should already be 
familiar to you from nonrelativistic quantum mechanics. Nevertheless, it may be helpful to review the basic 
notions involved. For the sake of definiteness, consider some happy experimentalist sending a beam of hapless 
electrons crashing into a stationary proton. The flux of the beam is defined as the number of electrons crossing 
an imagined unit area per unit time and is thus given by F = nv, where n and v denote the density and velocity 
of the electrons in the beam. The measured event rate divided by the flux of the beam is defined to be the cross 
section $\sigma$, which has the dimension of an area and could be thought of as the effective size of the proton as seen 
by the electrons. 

It may be more helpful to go to the rest frame of the electrons, in which the proton is plowing through the 
cloud of electrons like a bulldozer. In time $\Delta t$ the proton moves through a distance $v\Delta t$ and thus sweeps through 
a volume $\sigma v\Delta t$, which contains $n\sigma v\Delta t$ electrons. Dividing this by $\Delta t$ gives us the event rate $nv\sigma$. 

To measure the differential cross section, the experimentalist sets up, typically in the lab frame in which the 
target particle is at rest, a detector spanning a solid angle $d\Omega = \sin\theta\,d\theta\,d\phi$ and counts the number of events per 
unit time. 

All of this is familiar stuff. Now we could essentially take over our calculation of the differential decay rate 
almost in its entirety to calculate the differential cross section for the process $p_1 + p_2 \to k_1 + k_2 + \cdots + k_n$. With 
two particles in the initial state we now have a factor of $(1/V)^{n+2}$ in the transition probability. But as before, the 
square of the momentum conservation delta function produces one power of $V$ and counting the momentum 
final states gives a factor $V^n$, so that we are left with a factor of $1/V$. You might be worried about this remaining 
factor of $1/V$, but recall that we still have to divide by the flux, given by $|\vec{v}_1 - \vec{v}_2|\,n$. Since we have normalized to 
one particle in the box the density $n$ is $1/V$. Once again, all factors of $V$ cancel, as they must. 

The procedure is thus to draw all relevant diagrams to the order desired and calculate the Feynman amplitude 
$\mathcal{M}$ for the process $p_1 + p_2 \to k_1 + k_2 + \cdots + k_n$. Then the differential cross section is given by (again assuming 
all particles to be bosons) 

$$d\sigma = \frac{1}{|\vec{v}_1 - \vec{v}_2|\,2\omega(p_1)\,2\omega(p_2)}\,\frac{d^3k_1}{(2\pi)^3 2\omega(k_1)}\cdots\frac{d^3k_n}{(2\pi)^3 2\omega(k_n)}\,(2\pi)^4\delta^{(4)}\left(p_1 + p_2 - \sum_{i=1}^n k_i\right)|\mathcal{M}|^2 \tag{39}$$


We are implicitly working in a collinear frame in which the velocities of the incoming particles, $\vec{v}_1$ and $\vec{v}_2$, 
point in opposite directions. This class of frames includes the familiar center of mass frame and the lab frame (in 
which $\vec{v}_2 = 0$). In a collinear frame, $p_1 = E_1(1, 0, 0, v_1)$ and $p_2 = E_2(1, 0, 0, v_2)$, and a simple calculation shows 
that $((p_1 p_2)^2 - m_1^2 m_2^2) = (E_1 E_2(v_1 - v_2))^2$. We could write the factor $|v_1 - v_2|E_1 E_2$ in $d\sigma$ in the more 
invariant-looking form $((p_1 p_2)^2 - m_1^2 m_2^2)^{1/2}$, thus showing explicitly that the differential cross section is invariant under 
Lorentz boosts in the direction of the beam, as physically must be the case. 

An often encountered case involves two particles scattering into two particles in the center of mass frame. Let 
us do the phase space integral $\int(d^3k_1/2\omega_1)(d^3k_2/2\omega_2)\,\delta^{(4)}(P - k_1 - k_2)$ here for easy reference. We will do it in 
two different ways for your edification. 
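Before doing the integral, the collinear-frame identity quoted above for the flux factor can be verified numerically, together with its invariance under a boost along the beam. This is a minimal sketch in units with $c = 1$; the masses, velocities, and boost parameter are arbitrary illustrative values, and the function names are not from the text.

```python
import math

def energy(m, v):
    """Energy of a particle of mass m moving with velocity v (c = 1)."""
    return m / math.sqrt(1 - v * v)

def flux_factor(m1, v1, m2, v2):
    """|v1 - v2| E1 E2 in a collinear frame."""
    return abs(v1 - v2) * energy(m1, v1) * energy(m2, v2)

def invariant_form(m1, v1, m2, v2):
    """sqrt((p1.p2)^2 - m1^2 m2^2) with p_i = E_i (1, 0, 0, v_i)."""
    p1p2 = energy(m1, v1) * energy(m2, v2) * (1 - v1 * v2)
    return math.sqrt(p1p2**2 - (m1 * m2) ** 2)

def boosted_velocity(v, beta):
    """Velocity seen from a frame moving with velocity beta along the beam."""
    return (v - beta) / (1 - v * beta)

m1, v1, m2, v2 = 1.0, 0.6, 2.0, -0.3     # illustrative values only
beta = 0.5
original = flux_factor(m1, v1, m2, v2)
boosted = flux_factor(m1, boosted_velocity(v1, beta),
                      m2, boosted_velocity(v2, beta))
```

The individual factors $|v_1 - v_2|$, $E_1$, and $E_2$ all change under the boost; only their product is unchanged.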

We could immediately integrate over $d^3k_2$ thus knocking out the 3-dimensional momentum conservation 
delta function $\delta^{(3)}(\vec{k}_1 + \vec{k}_2)$. Writing $d^3k_1 = k_1^2\,dk_1\,d\Omega$, we integrate over the remaining energy conservation delta 
function $\delta(\sqrt{k_1^2 + m_1^2} + \sqrt{k_1^2 + m_2^2} - E_{\text{total}})$. Using (I.2.13), we find that the integral over $k_1$ gives $k_1\omega_1\omega_2/E_{\text{total}}$, 
where $\omega_1 = \sqrt{k_1^2 + m_1^2}$ and $\omega_2 = \sqrt{k_1^2 + m_2^2}$, with $k_1$ determined by $\sqrt{k_1^2 + m_1^2} + \sqrt{k_1^2 + m_2^2} = E_{\text{total}}$. Thus we obtain 

$$\int\frac{d^3k_1}{2\omega_1}\,\frac{d^3k_2}{2\omega_2}\,\delta^{(4)}(P - k_1 - k_2) = \frac{k_1}{4E_{\text{total}}}\int d\Omega \tag{40}$$

Once again, if you use the “rest normalization” for fermions, remember to make the replacement as explained 
above for the decay rate. The factor of $\frac{1}{4}$ should be replaced by $m_f/2$ for one fermion and one boson, and by 
$m_1 m_2$ for two fermions. 

Alternatively, we use (I.8.14) and regressing, write $d^3k_2 = \int d^4k_2\,\theta(k_2^0)\,\delta(k_2^2 - m_2^2)\,2\omega_2$. Integrate over $d^4k_2$ and 
knock out the 4-dimensional delta function, leaving us with $\int 2\omega_2\,dk_1\,k_1^2\,d\Omega/(2\omega_1 2\omega_2)\,\delta((P - k_1)^2 - m_2^2)$. The 
argument of the delta function is $E_{\text{total}}^2 - 2E_{\text{total}}\omega_1 + m_1^2 - m_2^2$, and thus integrating over $k_1$ we get a factor of 
$2E_{\text{total}}$ in the denominator, giving a result in agreement with (40). 
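Both routes to (40) reduce to evaluating $\int dk\,h(k)\,\delta(F(k)) = h(k_1)/|F'(k_1)|$ at the root $k_1$. The sketch below checks, for illustrative masses and total energy, that each route yields the radial factor $k_1/4E_{\text{total}}$. The root is found by bisection and the derivatives by central differences, which are choices made here for brevity and not anything prescribed in the text.

```python
import math

def omega(k, m):
    return math.sqrt(k * k + m * m)

def solve_k1(E_total, m1, m2):
    """Bisection root of omega_1 + omega_2 = E_total (needs E_total > m1 + m2)."""
    lo, hi = 0.0, E_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if omega(mid, m1) + omega(mid, m2) < E_total:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def derivative(func, x, h=1e-6):
    """Central finite-difference derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

def radial_factor_first_way(E_total, m1, m2):
    """k^2 / (2w1 2w2 |f'|) with f = w1 + w2 - E_total."""
    k1 = solve_k1(E_total, m1, m2)
    f = lambda k: omega(k, m1) + omega(k, m2) - E_total
    return k1**2 / (4 * omega(k1, m1) * omega(k1, m2) * abs(derivative(f, k1)))

def radial_factor_second_way(E_total, m1, m2):
    """k^2 / (2w1 |g'|) with g = E^2 - 2 E w1 + m1^2 - m2^2."""
    k1 = solve_k1(E_total, m1, m2)
    g = lambda k: E_total**2 - 2 * E_total * omega(k, m1) + m1**2 - m2**2
    return k1**2 / (2 * omega(k1, m1) * abs(derivative(g, k1)))

E_total, m1, m2 = 5.0, 1.0, 2.0          # illustrative values only
k1 = solve_k1(E_total, m1, m2)
expected = k1 / (4 * E_total)
```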

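For the record, you could work out the kinematics and obtain 

$$k_1 = \sqrt{(E_{\text{total}}^2 - (m_1 + m_2)^2)(E_{\text{total}}^2 - (m_1 - m_2)^2)}\,/\,2E_{\text{total}}$$

Evidently, this phase space integral also applies to the decay into two particles in the rest frame of the parent 
particle, in which case we replace $E_{\text{total}}$ by $M$. In particular, for our toy example, we have 

$$\Gamma = \frac{g^2}{16\pi M^3}\sqrt{(M^2 - (m + \mu)^2)(M^2 - (m - \mu)^2)} \tag{41}$$

The differential cross section for two-into-two scattering in the center of mass frame is given by 

$$\frac{d\sigma}{d\Omega} = \frac{1}{(2\pi)^2\,|\vec{v}_1 - \vec{v}_2|\,2\omega(p_1)\,2\omega(p_2)}\,\frac{k_1}{E_{\text{total}}}\,F\,|\mathcal{M}|^2 \tag{42}$$

Here $F$ stands for the factor $\frac{1}{4}$ in (40) when the final particles are bosons, with the replacements described after (40) for fermions. 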
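The two-body momentum $k_1$ and the toy decay rate above are easy to cross-check. In this sketch (function names are illustrative) the rate is assembled from (38) and (40) with $|\mathcal{M}|^2 = g^2$, that is $\Gamma = \frac{1}{2M}\frac{(2\pi)^4}{(2\pi)^6}\,g^2\,\frac{k_1}{4M}\,4\pi = g^2k_1/8\pi M^2$, and compared with the closed form.

```python
import math

def two_body_momentum(E_total, m1, m2):
    """Magnitude of the final momentum in the center of mass frame."""
    return math.sqrt((E_total**2 - (m1 + m2) ** 2)
                     * (E_total**2 - (m1 - m2) ** 2)) / (2 * E_total)

def toy_decay_rate(g, M, m, mu):
    """Gamma from (38) and (40) with |M|^2 = g^2, integrated over 4*pi."""
    k1 = two_body_momentum(M, m, mu)
    return g**2 * k1 / (8 * math.pi * M**2)

def toy_decay_rate_closed_form(g, M, m, mu):
    """Closed form g^2 sqrt(...) / (16 pi M^3)."""
    root = math.sqrt((M**2 - (m + mu) ** 2) * (M**2 - (m - mu) ** 2))
    return g**2 * root / (16 * math.pi * M**3)

g, M, m, mu = 0.7, 5.0, 1.0, 2.0         # illustrative values only
k1 = two_body_momentum(M, m, mu)
```

For massless decay products the formula reduces to $k_1 = M/2$, as it must.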


In particular, in the text we calculated electron-electron scattering in the relativistic limit. As shown there, we 
can write $|\mathcal{M}|^2 = |\tilde{\mathcal{M}}|^2/(2m)^4$ in terms of some reduced invariant amplitude $\tilde{\mathcal{M}}$. The factor $1/(2m)^4$ transforms 
the factors $E(p)/m$ into $2E$. Things simplify enormously, with $|\vec{v}_1 - \vec{v}_2| = 2$ and $k_1/E_{\text{total}} = \frac{1}{2}$, so that finally 

$$\frac{d\sigma}{d\Omega} = \frac{1}{2^4(4\pi)^2E^2}\,|\tilde{\mathcal{M}}|^2 \tag{43}$$


Last, we come to the statistical factor $S$ that must be included in calculating the total decay rate and the total 
cross section to avoid over-counting if there are identical particles in the final state. The factor $S$ has nothing to do 
with quantum field theory per se and should already be familiar to you from nonrelativistic quantum mechanics. 
The rule is that if there are $n_i$ identical particles of type $i$ in the final state, the total decay rate or the total cross 
section must be multiplied by $S = \prod_i 1/n_i!$ to account for indistinguishability. 

To see the necessity for this factor, it suffices to think about the simplest case of two identical Bose particles. 
To be specific, consider electron-positron annihilation into two photons (which we will study in chapter II.8). For 
simplicity, average and sum over all spin polarizations. Let us calculate $d\sigma/d\Omega$ according to (43) above. This is 
the probability that a photon will check into a detector set up at angles $\theta$ and $\phi$ relative to the beam direction. If 
the detector clicks, then we know that the other photon emerged at an angle $\pi - \theta$ relative to the beam direction. 
Thus the total cross section should be 

$$\sigma = \frac{1}{2}\int d\Omega\,\frac{d\sigma}{d\Omega} = \frac{1}{2}\int_0^\pi d\theta\,2\pi\sin\theta\,\frac{d\sigma}{d\Omega} \tag{44}$$

(The second equality is for all the elementary cases we will encounter in which $d\sigma$ does not depend on the 
azimuthal angle $\phi$.) In other words, to avoid double counting, we should divide by 2 if we integrate over the full 
angular range of $\theta$. 

More formally, we argue as follows. In quantum mechanics, a set of states $|\alpha\rangle$ is complete if $1 = \sum_\alpha|\alpha\rangle\langle\alpha|$ 
(“decomposition of 1”). Acting with this on $|\beta\rangle$ we see that these states must be normalized according to 
$\langle\alpha|\beta\rangle = \delta_{\alpha\beta}$. 





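Now consider the state 

$$|k_1, k_2\rangle \equiv \frac{1}{\sqrt{2}}\,\tilde{a}^\dagger(k_1)\tilde{a}^\dagger(k_2)|0\rangle = |k_2, k_1\rangle \tag{45}$$

containing two identical bosons. By repeatedly using the commutation relation (32), we compute 
$\langle q_1, q_2|k_1, k_2\rangle = \frac{1}{2}\langle 0|\tilde{a}(q_1)\tilde{a}(q_2)\tilde{a}^\dagger(k_1)\tilde{a}^\dagger(k_2)|0\rangle = \frac{1}{2}(\delta_{q_1k_1}\delta_{q_2k_2} + \delta_{q_2k_1}\delta_{q_1k_2})$. 
Thus $\sum_{q_1}\sum_{q_2}|q_1, q_2\rangle\langle q_1, q_2|k_1, k_2\rangle = \sum_{q_1}\sum_{q_2}|q_1, q_2\rangle\,\frac{1}{2}(\delta_{q_1k_1}\delta_{q_2k_2} + \delta_{q_2k_1}\delta_{q_1k_2}) = \frac{1}{2}(|k_1, k_2\rangle + |k_2, k_1\rangle) = |k_1, k_2\rangle$. 
Thus the states $|k_1, k_2\rangle$ are normalized properly. In the sum over states, we have 
$1 = \cdots + \sum_{q_1}\sum_{q_2}|q_1, q_2\rangle\langle q_1, q_2| + \cdots$. 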

In other words, if we are to sum over $q_1$ and $q_2$ independently, then we must normalize our states as in (45) 
with the factor of $1/\sqrt{2}$. But then this factor would appear multiplying $\mathcal{M}$. In calculating the total decay rate 
or the total cross section, we are effectively summing over a complete set of final states. In summary, we have 
two options: either we treat the integration over $d^3k_1\,d^3k_2$ as independent in which case we have to multiply the 
integral by $\frac{1}{2}$, or we integrate over only half of phase space. 

We readily generalize from this factor of $\frac{1}{2}$ to the statistical factor $S$. 

In closing, let me mention two interesting pieces of physics. 

To calculate the cross section $\sigma$, we have to divide by the flux, and hence $\sigma$ is proportional to $1/|\vec{v}_1 - \vec{v}_2|$. 
For exothermal processes, such as electron-positron annihilation into photons or slow neutron capture, $\sigma$ could 
become huge as the relative velocity $v_{\text{rel}} \to 0$. Fermi exploited this fact to great advantage in studying nuclear 
fission. Note that although the cross section, which has dimension of an area, formally goes to infinity, the 
reaction rate (the number of reactions per unit time) remains finite. 

Positronium decay into photons is an example of a bound state decaying in finite time. In positronium, the 
positron and electron are not approaching each other in plane wave states, as we assumed in our cross section 
calculation. Rather, the probability (per unit volume) that the positron finds itself near the electron is given by 
$|\psi(0)|^2$ according to elementary quantum mechanics, with $\psi(x)$ the bound state wave function for whatever 
state of positronium we are interested in. In other words, $|\psi(0)|^2$ gives the volume density of positrons near the 
electron. Since $v\sigma$ is a volume divided by time, the decay rate is given by $\Gamma = v\sigma|\psi(0)|^2$. 


Exercises 


II.6.1 Show that the differential cross section for a relativistic electron scattering in a Coulomb potential is 
given by 

$$\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4p^2v^2\sin^4(\theta/2)}\left(1 - v^2\sin^2(\theta/2)\right).$$

Known as the Mott cross section, it reduces to the Rutherford cross section you derived in a course on 
quantum mechanics in the limit the electron velocity $v \to 0$. 

II.6.2 To order $e^2$ the amplitude for positron scattering off a proton is just minus the amplitude (3) for electron 
scattering off a proton. Thus, somewhat counterintuitively, the differential cross sections for positron 
scattering off a proton and for electron scattering off a proton are the same to this order. Show that to 
the next order this is no longer true. 

II.6.3 Show that the trace of a product of an odd number of gamma matrices vanishes. 

II.6.4 Prove the identity $s + t + u = \sum_a m_a^2$. 

II.6.5 Verify the differential cross section for relativistic electron-electron scattering given in (25). 

II.6.6 For those who relish long calculations, determine the differential cross section for electron-electron 
scattering without taking the relativistic limit. 

II.6.7 Show that the decay rate for one boson of mass $M$ into two bosons of masses $m$ and $\mu$ is given by 

$$\Gamma = \frac{|\mathcal{M}|^2}{16\pi M^3}\sqrt{(M^2 - (m + \mu)^2)(M^2 - (m - \mu)^2)}$$




Diagrammatic Proof of Gauge Invariance 


Gauge invariance 

Conceptually, rather than calculate cross sections, we have the more important task of 
proving that we can indeed set the photon mass $\mu$ equal to zero with impunity in calculating 
any physical process. With $\mu = 0$, the Lagrangian given in chapter II.1 becomes the 
Lagrangian for quantum electrodynamics: 

$$\mathcal{L} = \bar{\psi}\left[i\gamma^\mu(\partial_\mu - ieA_\mu) - m\right]\psi - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} \tag{1}$$

We are now ready for one of the most important observations in the history of theoretical 
physics. Behold, the Lagrangian is left invariant by the gauge transformation 

$$\psi(x) \to e^{i\Lambda(x)}\psi(x) \tag{2}$$

and 

$$A_\mu(x) \to A_\mu(x) + \frac{1}{ie}\,e^{-i\Lambda(x)}\partial_\mu e^{i\Lambda(x)} = A_\mu(x) + \frac{1}{e}\,\partial_\mu\Lambda(x) \tag{3}$$

which implies 

$$F_{\mu\nu}(x) \to F_{\mu\nu}(x) \tag{4}$$

You are of course already familiar with (3) and the invariance of $F_{\mu\nu}$ from classical 
electromagnetism. 

In contemporary theoretical physics, gauge invariance 1 is regarded as fundamental and 
all important, as we will see later. The modern philosophy is to look at (1) as a consequence 
of (2) and (3). If we want to construct a gauge invariant relativistic field theory involving a 
spin $\frac{1}{2}$ and a spin 1 field, then we are forced to quantum electrodynamics. 

1 The discovery of gauge invariance was one of the most arduous in the history of physics. Read J. D. Jackson 
and L. B. Okun, “Historical roots of gauge invariance,” Rev. Mod. Phys. 73, 2001 and learn about the sad story of 
a great physicist whose misfortune in life was that his name differed from that of another physicist by only one 
letter. 

You will notice that in (3) I have carefully given two equivalent forms. While it is simpler, 
and commonly done in most textbooks, to write the second form, we should also keep the 
first form in mind. Note that $\Lambda(x)$ and $\Lambda(x) + 2\pi$ give exactly the same transformation. 
Mathematically speaking, the quantities $e^{i\Lambda(x)}$ and $\partial_\mu\Lambda(x)$ are well defined, but $\Lambda(x)$ is 
not. 

After these apparently formal but actually physically important remarks, we are ready 
to work on the proof. I will let you give the general proof, but I will show you the way by 
working through some representative examples. 

Recall that the propagator for the hypothetical massive photon is $iD_{\mu\nu} = i(k_\mu k_\nu/\mu^2 - g_{\mu\nu})/(k^2 - \mu^2)$. 
We can set the $\mu^2$ in the denominator equal to zero without further ado 
and write the photon propagator effectively as $iD_{\mu\nu} = i(k_\mu k_\nu/\mu^2 - g_{\mu\nu})/k^2$. The dangerous 
term is $k_\mu k_\nu/\mu^2$. We want to show that it goes away. 


A specific example 


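First consider electron-electron scattering to order $e^4$. Of the many diagrams, focus on the 
two in figure II.7.1a. The Feynman amplitude is then 

$$\bar{u}(p')\left(\gamma^\lambda\frac{1}{\slashed{p} + \slashed{k} - m}\gamma^\mu + \gamma^\mu\frac{1}{\slashed{p}' - \slashed{k} - m}\gamma^\lambda\right)u(p)\;\frac{i}{k^2}\left(\frac{k_\mu k_\nu}{\mu^2} - g_{\mu\nu}\right)\Gamma_\lambda{}^\nu \tag{5}$$

where $\Gamma_\lambda{}^\nu$ is some factor whose detailed structure does not concern us. For the specific 
case shown in figure II.7.1a we can of course write out $\Gamma_\lambda{}^\nu$ explicitly if we want. Note the 
plus sign here from interchanging the two photons since photons obey Bose statistics. 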




Figure II.7.1 (a), (b) 


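Focus on the dangerous term. Contracting the $\bar{u}(p')(\cdots)u(p)$ factor in (5) with $k_\mu$ we 
have 

$$\bar{u}(p')\left(\gamma^\lambda\frac{1}{\slashed{p} + \slashed{k} - m}\slashed{k} + \slashed{k}\frac{1}{\slashed{p}' - \slashed{k} - m}\gamma^\lambda\right)u(p) \tag{6}$$

The trick is to write the $\slashed{k}$ in the numerator of the first term as $(\slashed{p} + \slashed{k} - m) - (\slashed{p} - m)$, and 
in the numerator of the second term as $(\slashed{p}' - m) - (\slashed{p}' - \slashed{k} - m)$. Using $(\slashed{p} - m)u(p) = 0$ 
and $\bar{u}(p')(\slashed{p}' - m) = 0$, we see that the first term collapses to $\bar{u}(p')\gamma^\lambda u(p)$ and the second to 
minus the same, so that the expression in (6) vanishes. This proves the theorem 
in this simple example. But since the explicit form of $\Gamma_\lambda{}^\nu$ did not enter, the proof would 
have gone through even if figure II.7.1a were replaced by the more general figure II.7.1b, 
where arbitrarily complicated processes could be going on under the shaded blob. 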
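The cancellation in (6) can be watched numerically. The sketch below assumes the Dirac representation and builds on-shell spinors by projection: $u = (\slashed{p} + m)\chi$ satisfies $(\slashed{p} - m)u = 0$ for any $\chi$ when $p^2 = m^2$, and similarly for the row spinor. The row vector `ubar` is therefore just some solution of $\bar{u}(\slashed{p}' - m) = 0$, not the Dirac conjugate of a normalized physical spinor, and the momenta are arbitrary illustrative values. The photon momentum $k$ is deliberately left off shell, since the argument never uses $k^2 = 0$.

```python
import numpy as np

# Dirac-basis gamma matrices gamma^mu; metric signature (+, -, -, -).
I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
gamma_up = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
gamma_up += [np.block([[Z2, s], [-s, Z2]]) for s in sigma]
eta = np.diag([1.0, -1.0, -1.0, -1.0])
gamma_dn = [eta[m, m] * g for m, g in enumerate(gamma_up)]

def slash(p):
    """p-slash = gamma_mu p^mu for contravariant components p^mu."""
    return sum(g * c for g, c in zip(gamma_dn, p))

m = 1.0

def on_shell(p3):
    """4-momentum with p^2 = m^2 built from a 3-momentum."""
    p3 = np.asarray(p3, dtype=float)
    return np.concatenate(([np.sqrt(m**2 + p3 @ p3)], p3))

p = on_shell([0.3, -0.2, 0.5])           # incoming electron
p_out = on_shell([-0.1, 0.4, 0.2])       # outgoing electron
k = np.array([0.7, 0.2, -0.3, 0.1])      # photon momentum, off shell

rng = np.random.default_rng(2)
chi_in = rng.normal(size=4) + 1j * rng.normal(size=4)
chi_out = rng.normal(size=4) + 1j * rng.normal(size=4)
u = (slash(p) + m * I4) @ chi_in         # (p-slash - m) u = 0
ubar = chi_out @ (slash(p_out) + m * I4) # ubar (p'-slash - m) = 0

S_in = np.linalg.inv(slash(p + k) - m * I4)
S_out = np.linalg.inv(slash(p_out - k) - m * I4)

first = np.array([ubar @ gamma_up[l] @ S_in @ slash(k) @ u for l in range(4)])
second = np.array([ubar @ slash(k) @ S_out @ gamma_up[l] @ u for l in range(4)])
vertex = np.array([ubar @ gamma_up[l] @ u for l in range(4)])
```

Each term separately equals plus or minus $\bar{u}(p')\gamma^\lambda u(p)$ (stored in `vertex`), and only their sum vanishes.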

Indeed we can generalize to figure II.7.1c. Apart from the photon carrying momentum 
$k$ that we are focusing on, there are already $n$ photons attached to the electron line. These 
$n$ photons are just “spectators” in the proof in the same way that the photon carrying 
momentum $k'$ in figure II.7.1a never came into the proof that (6) vanishes. The photon 
we are focusing on can attach to the electron line in $n + 1$ different places. You can now 
extend the proof as an exercise. 


Photon landing on an internal line 

In the example we just considered, the photon line in question lands on an external electron 
line. The fact that the line is “capped at the two ends” by $\bar{u}(p')$ and $u(p)$ is crucial in the 
proof. What if the photon line in question lands on an internal line? 

An example is shown in figure II.7.2, contributing to electron-electron scattering in 
order $e^8$. The figure contains three distinct diagrams. The electron “on the left” emits 
three photons, which attach to an internal electron loop. The electron “on the right” 
emits a photon with momentum $k$, which can attach to the loop in three distinct ways. 

Figure II.7.1 (c) 

Since what we care about is whether the $k_\mu k_\rho/\mu^2$ piece in the photon propagator 
$i(k_\mu k_\rho/\mu^2 - g_{\mu\rho})/k^2$ goes away or not, we can for our purposes replace that photon 
propagator by $k_\mu$. To save writing slightly, we define $p_1 = p + q_1$ and $p_2 = p_1 + q_2$ (see the 
momentum labels in figure II.7.2). Let’s focus on the relevant part of the three diagrams, 
referring to them as A, B, and C. 

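$$A = \int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\gamma^\lambda\frac{1}{\slashed{p} + \slashed{k} - m}\slashed{k}\frac{1}{\slashed{p} - m}\right) \tag{7}$$

$$B = \int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\slashed{k}\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) \tag{8}$$

$$C = \int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\slashed{k}\frac{1}{\slashed{p}_2 - m}\gamma^\sigma\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) \tag{9}$$

This looks like an unholy mess, but it really isn’t. We use the same trick we used before. 
In C write $\slashed{k} = (\slashed{p}_2 + \slashed{k} - m) - (\slashed{p}_2 - m)$, so that 

$$C = \int\frac{d^4p}{(2\pi)^4}\left[\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 - m}\gamma^\sigma\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) - \mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right)\right] \tag{10}$$

In B write $\slashed{k} = (\slashed{p}_1 + \slashed{k} - m) - (\slashed{p}_1 - m)$, so that 

$$B = \int\frac{d^4p}{(2\pi)^4}\left[\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) - \mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right)\right] \tag{11}$$

Finally, in A write $\slashed{k} = (\slashed{p} + \slashed{k} - m) - (\slashed{p} - m)$ 

$$A = \int\frac{d^4p}{(2\pi)^4}\left[\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) - \mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\gamma^\lambda\frac{1}{\slashed{p} + \slashed{k} - m}\right)\right] \tag{12}$$

Figure II.7.2 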


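Now you see what is happening. When we add the three diagrams together terms cancel 
in pairs, leaving us with 

$$A + B + C = \int\frac{d^4p}{(2\pi)^4}\left[\mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 - m}\gamma^\sigma\frac{1}{\slashed{p}_1 - m}\gamma^\lambda\frac{1}{\slashed{p} - m}\right) - \mathrm{tr}\left(\gamma^\nu\frac{1}{\slashed{p}_2 + \slashed{k} - m}\gamma^\sigma\frac{1}{\slashed{p}_1 + \slashed{k} - m}\gamma^\lambda\frac{1}{\slashed{p} + \slashed{k} - m}\right)\right] \tag{13}$$

If we shift (see exercise II.7.2) the dummy integration variable $p \to p - k$ in the second 
term, we see that the two terms cancel. Indeed, the $k_\mu k_\rho/\mu^2$ piece in the photon propagator 
goes away and we can set $\mu = 0$. 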

I will leave the general proof to you. We have done it for one particular process. Try it 
for some other process. You will see how it goes. 


Ward-Takahashi identity 


Let’s summarize. Given any physical amplitude $T^{\mu\cdots}(k, \cdots)$ with external electrons on 
shell [this is jargon for saying that all necessary factors $u(p)$ and $\bar{u}(p)$ are included in 
$T^{\mu\cdots}(k, \cdots)$] describing a process with a photon carrying momentum $k$ coming out of, or 
going into, a vertex labeled by the Lorentz index $\mu$, we have 

$$k_\mu T^{\mu\cdots}(k, \cdots) = 0 \tag{14}$$


This is sometimes known as a Ward-Takahashi identity. 

The bottom line is that we can write $iD_{\mu\nu} = -ig_{\mu\nu}/k^2$ for the photon propagator. Since 
we can discard the $k_\mu k_\nu/\mu^2$ term in the photon propagator $i(k_\mu k_\nu/\mu^2 - g_{\mu\nu})/k^2$ we can 
also add in a $k_\mu k_\nu/k^2$ term with an arbitrary coefficient. Thus, for the photon propagator 
we can use 

$$iD_{\mu\nu} = \frac{i}{k^2}\left[(1 - \xi)\frac{k_\mu k_\nu}{k^2} - g_{\mu\nu}\right] \tag{15}$$

where we can choose the number $\xi$ to simplify our calculation as much as possible. 
Evidently, the choice of $\xi$ amounts to a choice of gauge for the electromagnetic field. In 
particular, the choice $\xi = 1$ is known as the Feynman gauge, and the choice $\xi = 0$ is known 
as the Landau gauge. If you find an especially nice choice, you can have a gauge named 
after you as well! For fairly simple calculations, it is often advisable to calculate with an 
arbitrary $\xi$. The fact that the end result must not depend on $\xi$ provides a useful check on 
the arithmetic. 

This completes the derivation of the Feynman rules for quantum electrodynamics: They 
are the same rules as those given in chapter II.5 for the massive vector boson theory except 
for the photon propagator given in (15). 

We have given here a diagrammatic proof of the gauge invariance of quantum 
electrodynamics. We will worry later (in chapter IV.7) about the possibility that the shift of 
integration momentum used in the proof may not be allowed in some cases. 


The longitudinal mode 

We now come back to the worry we had in chapter I.5. Consider a massive spin 1 meson
moving along the z-direction. The 3 polarization vectors are fixed by the condition $k^\lambda\varepsilon_\lambda = 0$
with $k^\lambda = (\omega, 0, 0, k)$ (recall chapter I.5) and the normalization $\varepsilon^\lambda\varepsilon_\lambda = -1$, so that
$\varepsilon^{(1)}_\lambda = (0, 1, 0, 0)$, $\varepsilon^{(2)}_\lambda = (0, 0, 1, 0)$, $\varepsilon^{(3)}_\lambda = (-k, 0, 0, \omega)/\mu$. Note that as $\mu \to 0$, the longitudinal
polarization vector $\varepsilon^{(3)}_\lambda$ becomes proportional to $k_\lambda = (\omega, 0, 0, -k)$. The amplitude for
emitting a meson with a longitudinal polarization in the process described by (14) is
given by $\varepsilon^{(3)}_\lambda T^{\lambda\cdots} = (-kT^{0\cdots} + \omega T^{3\cdots})/\mu = (-kT^{0\cdots} + \sqrt{k^2 + \mu^2}\,T^{3\cdots})/\mu \simeq (-kT^{0\cdots} + (k + \frac{\mu^2}{2k})T^{3\cdots})/\mu$
(for $\mu \ll k$), namely $-(k_\lambda T^{\lambda\cdots}/\mu) + \frac{\mu}{2k}T^{3\cdots}$ with $k_\lambda = (k, 0, 0, -k)$. Upon
using (14) we see that the amplitude $\varepsilon^{(3)}_\lambda T^{\lambda\cdots} \to \frac{\mu}{2k}T^{3\cdots} \to 0$ as $\mu \to 0$.
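
The kinematics of this argument is easy to check with numbers. The following is a minimal sketch (function names are illustrative, not from the text): it builds $k^\lambda$ and $\varepsilon^{(3)}_\lambda$ for a given momentum and mass, and returns the piece of $\varepsilon^{(3)}_\lambda$ that is not proportional to $k_\lambda$, which should shrink like $\mu/2k$.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # signature (+, -, -, -)


def longitudinal_mode(k, mu):
    """Return (k^lambda, eps^(3)_lambda) for a meson of mass mu moving along z."""
    omega = np.sqrt(k**2 + mu**2)
    k_upper = np.array([omega, 0.0, 0.0, k])
    eps_lower = np.array([-k, 0.0, 0.0, omega]) / mu
    return k_upper, eps_lower


def decoupling_remainder(k, mu):
    """eps^(3)_lambda + k_lambda / mu: the part of eps^(3) not proportional to k_lambda."""
    k_upper, eps_lower = longitudinal_mode(k, mu)
    return eps_lower + (METRIC @ k_upper) / mu
```

For $k = 1$ and $\mu = 10^{-3}$ the remainder has components of size about $\mu/2k = 5 \times 10^{-4}$, and it keeps shrinking as $\mu$ is lowered, while the part proportional to $k_\lambda$ is killed by (14).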

The longitudinal mode of the photon does not exist because it decouples from all physical 
processes. 

Here is an apparent paradox. Mr. Boltzmann tells us that in thermal equilibrium each
degree of freedom is associated with $\frac{1}{2}T$. Thus, by measuring some thermal property (such
as the specific heat) of a box of photon gas to an accuracy of 2/3 an experimentalist could
tell if the photon is truly massless rather than have a mass of a zillionth of an electron volt.

The resolution is of course that as the coupling of the longitudinal mode vanishes as
$\mu \to 0$ the time it takes for the longitudinal mode to come to thermal equilibrium goes to
infinity. Our crafty experimentalist would have to be very patient.


Emission and absorption of photons 

According to chapter II.5, the amplitude for emitting or absorbing an external on-shell
photon with momentum $k$ and polarization $a$ ($a = 1, 2$) is given by $\varepsilon^{(a)}_\mu(k)T^{\mu\cdots}(k, \cdots)$.
Thanks to (14), we are free to vary the polarization vector

$$\varepsilon^{(a)}_\mu(k) \to \varepsilon^{(a)}_\mu(k) + \lambda k_\mu \tag{16}$$

for arbitrary $\lambda$. You should recognize (16) as the momentum space version of (3). As we
will see in the next chapter, by a judicious choice of $\varepsilon^{(a)}(k)$, we can simplify a given
calculation considerably. In one choice, known as the “transverse gauge,” the 4-vectors
$\varepsilon^{(a)}(k) = (0, \vec\varepsilon(k))$ for $a = 1, 2$ do not have time components. (For a photon moving in the
z-direction, this is just the choice specified in the preceding section.)

Exercises

II.7.1 Extend the proof to cover figure II.7.1c. [Hint: To get oriented, note that figure II.7.1b corresponds to
n = 1.]

II.7.2 You might have worried whether the shift of integration variable is allowed. Rationalizing the
denominators in the first integral

$$\int \frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\nu\,\frac{1}{\not{p}_2 - m}\,\gamma^\sigma\,\frac{1}{\not{p}_1 - m}\,\gamma^\lambda\,\frac{1}{\not{p} - m}\right)$$

in (13) and imagining doing the trace, you can convince yourself that this integral is only logarithmically
divergent and hence that the shift is allowed. This issue will come up again in chapter IV.7 and we are
anticipating a bit here.




Photon-Electron Scattering and Crossing 


Photon scattering on an electron 

We now apply what we just learned to calculating the amplitude for the Compton scattering
of a photon on an electron, namely, the process $\gamma(k) + e(p) \to \gamma(k') + e(p')$. First step:
draw the Feynman diagrams, and notice that there are two, as indicated in figure II.8.1.
The electron can either absorb the photon carrying momentum $k$ first or emit the photon
carrying momentum $k'$ first. Think back to the spacetime stories we talked about in chapter
I.7. The plot of our biopic here is boringly simple: the electron comes along, absorbs and
then emits a photon, or emits and absorbs a photon, and then continues on its merry way.
Because this is a quantum movie, the two alternate plots are shown superposed.

So, apply the Feynman rules (chapter II.5) to get (just to make the writing a bit easier,
we take the polarization vectors $\varepsilon$ and $\varepsilon'$ to be real)

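$$\mathcal{M} = A(\varepsilon', k'; \varepsilon, k) + (\varepsilon' \leftrightarrow \varepsilon,\ k' \leftrightarrow -k) \tag{1}$$

where

$$A(\varepsilon', k'; \varepsilon, k) = (-ie)^2\,\bar u(p')\,\not{\varepsilon}'\,\frac{i}{\not{p} + \not{k} - m}\,\not{\varepsilon}\,u(p) = \frac{i(-ie)^2}{2pk}\,\bar u(p')\,\not{\varepsilon}'(\not{p} + \not{k} + m)\not{\varepsilon}\,u(p) \tag{2}$$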

In either case, absorb first or emit first, the electron is penalized for not being real, by the
factor of $1/((p + k)^2 - m^2) = 1/(2pk)$ in one case, and $1/((p - k')^2 - m^2) = -1/(2pk')$ in
the other.

At this point, to obtain the differential cross section, you just have to take a deep breath and
calculate away. I will show you, however, that we could simplify the calculation considerably
by a clever choice of polarization vectors and of the frame of reference. (The calculation is
still a big mess, though!) For a change, we will be macho guys and not average and sum
over the photon polarizations.

Figure II.8.1


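In any case, we have $\varepsilon k = 0$ and $\varepsilon' k' = 0$. Now choose the transverse gauge introduced
in the preceding chapter, so that $\varepsilon$ and $\varepsilon'$ have zero time components. Then calculate in
the lab frame. Since $p = (m, 0, 0, 0)$, we have the additional relations

$$\varepsilon p = 0 \tag{3}$$

and

$$\varepsilon' p = 0 \tag{4}$$

Why is this a shrewd choice? Recall that $\not{a}\not{b} = 2ab - \not{b}\not{a}$. Thus, we could move $\not{p}$ past
$\not{\varepsilon}$ or $\not{\varepsilon}'$ at the rather small cost of flipping a sign. Notice in (2) that $(\not{p} + \not{k} + m)\not{\varepsilon}u(p) =
\not{\varepsilon}(-\not{p} - \not{k} + m)u(p) = -\not{\varepsilon}\not{k}u(p)$ (where we have used $\varepsilon k = 0$). Thus

$$A(\varepsilon', k'; \varepsilon, k) = ie^2\,\bar u(p')\,\frac{\not{\varepsilon}'\not{\varepsilon}\not{k}}{2pk}\,u(p) \tag{5}$$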

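To obtain the differential cross section, we need $|\mathcal{M}|^2$. We will wimp out a bit and
suppose, just as in chapter II.6, that the initial electron is unpolarized and the polarization
of the final electron is not measured. Then averaging over initial polarization and summing
over final polarization we have [applying (II.2.8)]

$$\frac{1}{2}\sum\sum|A(\varepsilon', k'; \varepsilon, k)|^2 = \frac{e^4}{2(2m)^2(2pk)^2}\,\mathrm{tr}\,(\not{p}' + m)\not{\varepsilon}'\not{\varepsilon}\not{k}(\not{p} + m)\not{k}\not{\varepsilon}\not{\varepsilon}' \tag{6}$$

In evaluating the trace, keep in mind that the trace of an odd number of gamma
matrices vanishes. The term proportional to $m^2$ contains $\not{k}\not{k} = k^2 = 0$ and hence vanishes.
We are left with $\mathrm{tr}(\not{p}'\not{\varepsilon}'\not{\varepsilon}\not{k}\not{p}\not{k}\not{\varepsilon}\not{\varepsilon}') = 2kp\,\mathrm{tr}(\not{p}'\not{\varepsilon}'\not{\varepsilon}\not{k}\not{\varepsilon}\not{\varepsilon}') = -2kp\,\mathrm{tr}(\not{p}'\not{\varepsilon}'\not{\varepsilon}\not{\varepsilon}\not{k}\not{\varepsilon}')$
$= 2kp\,\mathrm{tr}(\not{p}'\not{\varepsilon}'\not{k}\not{\varepsilon}') = 8kp[2(k\varepsilon')^2 + k'p]$.

Work through the steps as indicated and the strategy should be clear. We anticommute
judiciously to exploit the “zero relations” $\varepsilon p = 0$, $\varepsilon' p = 0$, $\varepsilon k = 0$, and $\varepsilon' k' = 0$ and the
normalization conditions $\not{\varepsilon}\not{\varepsilon} = \varepsilon^2 = -1$ and $\not{\varepsilon}'\not{\varepsilon}' = \varepsilon'^2 = -1$ as much as possible.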
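
If you would rather let a computer do the anticommuting, both the spinor simplification behind (5) and the trace just evaluated can be tested with explicit gamma matrices. What follows is a hedged sketch, not part of the text's derivation: it assumes the Dirac basis, a rest-frame spinor $u = (1, 0, 0, 0)$, and a particular transverse choice of $\varepsilon$ and $\varepsilon'$; all function names are illustrative.

```python
import numpy as np

# Dirac matrices in the Dirac basis, metric signature (+, -, -, -).
I2, Z2 = np.eye(2), np.zeros((2, 2))
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)]
GAMMA += [np.block([[Z2, s], [-s, Z2]]) for s in PAULI]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(a, b):
    """Minkowski product of two 4-vectors given by contravariant components."""
    return a @ METRIC @ b


def slash(a):
    """Feynman slash gamma^mu a_mu."""
    a_lower = METRIC @ a
    return sum(a_lower[mu] * GAMMA[mu] for mu in range(4))


def chain(*vectors):
    """Matrix product of slashed 4-vectors, left to right."""
    out = np.eye(4, dtype=complex)
    for vector in vectors:
        out = out @ slash(vector)
    return out


def compton_kinematics(m, omega, theta, phi):
    """Lab-frame momenta and transverse-gauge polarizations (illustrative choice)."""
    omega_out = omega / (1.0 + (omega / m) * (1.0 - np.cos(theta)))
    p = np.array([m, 0.0, 0.0, 0.0])
    k = omega * np.array([1.0, 0.0, 0.0, 1.0])
    k_out = omega_out * np.array([1.0, np.sin(theta) * np.cos(phi),
                                  np.sin(theta) * np.sin(phi), np.cos(theta)])
    p_out = p + k - k_out
    eps = np.array([0.0, 1.0, 0.0, 0.0])  # perpendicular to the z-axis
    eps_out = np.array([0.0, np.cos(theta) * np.cos(phi),
                        np.cos(theta) * np.sin(phi), -np.sin(theta)])  # perpendicular to k_out
    return p, k, k_out, p_out, eps, eps_out
```

With these helpers, `np.trace(chain(p_out, eps_out, eps, k, p, k, eps, eps_out))` can be compared against `8 * dot(k, p) * (2 * dot(k, eps_out)**2 + dot(k_out, p))` for any scattering angle.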

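We obtain

$$\frac{1}{2}\sum\sum|A(\varepsilon', k'; \varepsilon, k)|^2 = \frac{e^4}{2(2m)^2(2pk)^2}\,8kp[2(k\varepsilon')^2 + k'p] \tag{7}$$

The other term

$$\frac{1}{2}\sum\sum|A(\varepsilon, -k; \varepsilon', k')|^2 = \frac{e^4}{2(2m)^2(2pk')^2}\,8(-k'p)[2(k'\varepsilon)^2 - kp]$$

follows immediately by inspecting figure II.8.1 and interchanging ($\varepsilon' \leftrightarrow \varepsilon$, $k' \leftrightarrow -k$).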

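Just as in chapter II.6, the interference term

$$\frac{1}{2}\sum\sum A(\varepsilon, -k; \varepsilon', k')^* A(\varepsilon', k'; \varepsilon, k) = \frac{e^4}{2(2m)^2(2pk)(-2pk')}\,\mathrm{tr}\,(\not{p}' + m)\not{\varepsilon}'\not{\varepsilon}\not{k}(\not{p} + m)\not{k}'\not{\varepsilon}'\not{\varepsilon} \tag{8}$$

is the most tedious to evaluate. Call the trace $T$. Clearly, it would be best to eliminate
$p' = p + k - k'$, since we could “do more” with $\not{p}$ than $\not{p}'$. Divide and conquer: write $T =
P + Q_1 + Q_2$. First, massage $P \equiv \mathrm{tr}\,(\not{p} + m)\not{\varepsilon}'\not{\varepsilon}\not{k}(\not{p} + m)\not{k}'\not{\varepsilon}'\not{\varepsilon} = m^2\,\mathrm{tr}\,\not{\varepsilon}'\not{\varepsilon}\not{k}\not{k}'\not{\varepsilon}'\not{\varepsilon} +
\mathrm{tr}\,\not{p}\not{\varepsilon}'\not{\varepsilon}\not{k}\not{p}\not{k}'\not{\varepsilon}'\not{\varepsilon}$. In the second term, we could sail the first $\not{p}$ past the $\not{\varepsilon}'$ and $\not{\varepsilon}$ (ah, so
nice to work in the rest frame for this problem!) to find the combination $\not{p}\not{k}\not{p} = 2kp\,\not{p} -
m^2\not{k}$. The $m^2$ term gives a contribution that cancels the first term in $P$, leaving us with
$P = 2kp\,\mathrm{tr}\,\not{\varepsilon}'\not{\varepsilon}\not{p}\not{k}'\not{\varepsilon}'\not{\varepsilon} = 2kp\,\mathrm{tr}\,\not{p}\not{k}'\not{\varepsilon}'(2\varepsilon'\varepsilon - \not{\varepsilon}'\not{\varepsilon})\not{\varepsilon} = 8(kp)(k'p)[2(\varepsilon\varepsilon')^2 - 1]$. Similarly,
$Q_1 = \mathrm{tr}\,\not{k}\not{\varepsilon}'\not{\varepsilon}\not{k}\not{p}\not{k}'\not{\varepsilon}'\not{\varepsilon} = -2k\varepsilon'\,\mathrm{tr}\,\not{k}\not{p}\not{k}'\not{\varepsilon}' = -8(\varepsilon'k)^2 k'p$ and $Q_2 = -\mathrm{tr}\,\not{k}'\not{\varepsilon}'\not{\varepsilon}\not{k}\not{p}\not{k}'\not{\varepsilon}'\not{\varepsilon}$
$= 8(\varepsilon k')^2 kp$.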

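Putting it all together and writing $kp' = k'p = m\omega'$ and $k'p' = kp = m\omega$, we find

$$\frac{1}{2}\sum\sum|\mathcal{M}|^2 = \frac{e^4}{(2m)^2}\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} + 4(\varepsilon\varepsilon')^2 - 2\right] \tag{9}$$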

We calculate the differential cross section as in chapter II.6 with some minor differences
since we are in the lab frame, obtaining

[Equation (10) did not survive extraction; only the fragment $(2\pi)^2\,2\omega$ of its prefactor remains.]

As described in the appendix to chapter II.6, we could use (I.8.14) and write $\int \frac{d^3p'}{2E_{p'}}(\cdots) =
\int d^4p'\,\theta(p'^0)\delta(p'^2 - m^2)(\cdots)$. Doing the integral over $d^4p'$ to knock out the 4-dimensional
delta function, we are left with a delta function enforcing the mass shell condition $0 =
p'^2 - m^2 = (p + k - k')^2 - m^2 = 2p(k - k') - 2kk' = 2m(\omega - \omega') - 2\omega\omega'(1 - \cos\theta)$, with
$\theta$ the scattering angle of the photon. Thus, the frequency of the outgoing photon and of
the incoming photon are related by

$$\omega' = \frac{\omega}{1 + \frac{2\omega}{m}\sin^2\frac{\theta}{2}} \tag{11}$$


giving the frequency shift that won Arthur Compton the Nobel Prize. You realize of course
that this formula, though profound at the time, is “merely” relativistic kinematics and has
nothing to do with quantum field theory per se.

Figure II.8.2

What quantum field theory gives us is the Klein-Nishina formula (1929)

$$\frac{d\sigma}{d\Omega} = \frac{1}{(2m)^2}\left(\frac{e^2}{4\pi}\right)^2\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} + 4(\varepsilon\varepsilon')^2 - 2\right] \tag{12}$$

You ought to be impressed by the year.
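
The Compton frequency shift and the Klein-Nishina formula are simple enough to put into a few lines of code. This is a minimal sketch in natural units; the function names are illustrative, `alpha` stands for $e^2/4\pi$, and the numerical value of the fine structure constant is an approximate assumed input rather than something taken from the text.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant e^2 / (4 pi)


def compton_outgoing_frequency(omega, theta, m):
    """Outgoing photon frequency for incoming frequency omega and scattering angle theta."""
    return omega / (1.0 + (2.0 * omega / m) * math.sin(theta / 2.0) ** 2)


def mass_shell_residual(omega, omega_out, theta, m):
    """2m(omega - omega') - 2 omega omega' (1 - cos theta); zero on the mass shell."""
    return 2.0 * m * (omega - omega_out) - 2.0 * omega * omega_out * (1.0 - math.cos(theta))


def klein_nishina(omega, theta, eps_dot_eps_out, m, alpha=ALPHA):
    """Polarized differential cross section d(sigma)/d(Omega) in the lab frame.

    eps_dot_eps_out is the product of the two polarization vectors; only its
    square enters, so the sign convention of the metric does not matter.
    """
    ratio = compton_outgoing_frequency(omega, theta, m) / omega
    bracket = 1.0 / ratio + ratio + 4.0 * eps_dot_eps_out ** 2 - 2.0
    return alpha ** 2 / (4.0 * m ** 2) * ratio ** 2 * bracket
```

In the low-frequency limit $\omega \ll m$ the ratio $\omega'/\omega$ goes to 1 and the function reduces to $(\alpha^2/m^2)(\varepsilon\varepsilon')^2$, the classical Thomson result, which makes a convenient sanity check.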


Electron-positron annihilation 

Here and in chapter II.6 we calculated the cross sections for some interesting scattering
processes. At the end of that chapter we marvelled at the magic of theoretical physics.
Even more magical is the annihilation of matter and antimatter, a process that occurs
only in relativistic quantum field theory. Specifically, an electron and a positron meet and
annihilate each other, giving rise to two photons: $e^-(p_1) + e^+(p_2) \to \gamma(\varepsilon_1, k_1) + \gamma(\varepsilon_2, k_2)$.
(Annihilating into one physical, that is, on-shell, photon is kinematically impossible.)
This process, often featured in science fiction, is unknown in nonrelativistic quantum
mechanics. Without quantum field theory, you would be clueless on how to calculate, say,
the angular distribution of the outgoing photons.

But having come this far, you simply apply the Feynman rules to the diagrams in figure
II.8.2, which describe the process to order $e^2$. We find the amplitude $\mathcal{M} =
A(k_1, \varepsilon_1; k_2, \varepsilon_2) + A(k_2, \varepsilon_2; k_1, \varepsilon_1)$ (Bose statistics for the two photons!), where

$$A(k_1, \varepsilon_1; k_2, \varepsilon_2) = (ie)(-ie)\,\bar v(p_2)\,\not{\varepsilon}_2\,\frac{i}{\not{p}_1 - \not{k}_1 - m}\,\not{\varepsilon}_1\,u(p_1) \tag{13}$$

Students of quantum field theory are sometimes confused that while the incoming electron
goes with the spinor $u$, the incoming positron goes with $\bar v$, and not with $v$. You could check
this by inspecting the hermitean conjugate of (II.2.10). Even simpler, note that $\bar v(\cdots)u$
[with $(\cdots)$ a bunch of gamma matrices contracted with various momenta] transforms
correctly under the Lorentz group, while $v(\cdots)u$ does not (and does not even make sense,
since they are both column spinors). Or note that the annihilation operator $d$ for the
positron is associated with $\bar v$, not $v$.

I want to emphasize that the positron carries momentum $p_2 = (+\sqrt{\vec p_2^{\,2} + m^2}, \vec p_2)$ on its
way to that fatal rendezvous with the electron. Its energy $p_2^0 = +\sqrt{\vec p_2^{\,2} + m^2}$ is manifestly
positive. Nor is any physical particle traveling backward in time. The honest experimentalist
who arranged for the positron to be produced wouldn’t have it otherwise. Remember
my rant at the end of chapter II.2?

In figure II.8.2a I have labeled the various lines with arrows indicating momentum
flow. The external particles are physical and there would have been serious legal issues
if their energies were not positive. There is no such restriction on the virtual particle
being exchanged, though. Which way we draw the arrow on the virtual particle is purely
up to us. We could reverse the arrow, and then the momentum label would become
$p_2 - k_2 = k_1 - p_1$. The time component of this “composite” 4-vector can be either positive
or negative.

To make the point totally clear, we could also label the lines by dotted arrows showing the 
flow of (electron) charge. Indeed, on the positron line, momentum and (electron) charge 
flow in opposite directions. 


Crossing 


I now invite you to discover something interesting by staring at the expression in (13) for 
a while. 

Got it? Does it remind you of some other amplitude? 

No? How about looking at the amplitude for Compton scattering in (2)? 

Notice that the two amplitudes could be turned into each other (up to an irrelevant sign)
by the exchange

$$p \leftrightarrow p_1,\quad k \leftrightarrow -k_1,\quad p' \leftrightarrow -p_2,\quad k' \leftrightarrow k_2,\quad \varepsilon \leftrightarrow \varepsilon_1,\quad \varepsilon' \leftrightarrow \varepsilon_2,\quad u(p) \leftrightarrow u(p_1),\quad \bar u(p') \leftrightarrow \bar v(p_2) \tag{14}$$

This is known as crossing. Diagrammatically, we are effectively turning the diagrams in
figures II.8.1 and II.8.2 into each other by 90° rotations. Crossing expresses in precise terms
what people who like to mumble something about negative energy traveling backward in
time have in mind.

Once again, it is advantageous to work in the electron rest frame and in the transverse
gauge, so that we have $\varepsilon_1 p_1 = 0$ and $\varepsilon_2 p_1 = 0$ as well as $\varepsilon_1 k_1 = 0$ and $\varepsilon_2 k_2 = 0$. Averaging
over the electron and positron polarizations we obtain

$$\frac{d\sigma}{d\Omega} = (\cdots)\left[\frac{\omega_1}{\omega_2} + \frac{\omega_2}{\omega_1} - 4(\varepsilon_1\varepsilon_2)^2 + 2\right] \tag{15}$$

[The prefactor of (15), which involves $|\vec p|$, did not survive extraction.]

with $\omega_1 = m(m + E)/(m + E - p\cos\theta)$, $\omega_2 = (E - p\cos\theta)\omega_1/m$ (so that $\omega_1 + \omega_2 = m + E$), and
$p = |\vec p|$ and $E$ the positron momentum and energy, respectively.

Figure II.8.3


Special relativity and quantum mechanics require antimatter 

The formalism in chapter II.2 makes it totally clear that antimatter is obligatory. For us to
be able to add the operators $b$ and $d^\dagger$ in (II.2.10) they must carry the same electric charge,
and thus $b$ and $d$ carry opposite charge. No room for argument there. Still, it would be
comforting to have a physical argument that special relativity and quantum mechanics
mandate antimatter.

Compton scattering offers a context for constructing a nice heuristic argument. Think
of the process in spacetime. We have redrawn figure II.8.1a in figure II.8.3: the electron is
hit by the photon at the point $x$, propagates to the point $y$, and emits a photon. We have
assumed implicitly that $(y^0 - x^0) > 0$, since we don’t know what propagating backward
in time means. (If the reader knows how to build a time machine, let me know.) But
special relativity tells us that another observer moving by (along the 1-direction say) would
see the time difference $(y'^0 - x'^0) = \cosh\varphi\,(y^0 - x^0) - \sinh\varphi\,(y^1 - x^1)$, which could be
negative for large enough boost parameter $\varphi$, provided that $(y^1 - x^1) > (y^0 - x^0)$, that
is, if the separation between the two spacetime points $x$ and $y$ were spacelike. Then this
observer would see the field disturbance propagating from $y$ to $x$. Since we see negative
electric charge propagating from $x$ to $y$, the other observer must see positive electric
charge propagating from $y$ to $x$. Without special relativity, as in nonrelativistic quantum
mechanics, we simply write down the Schrödinger equation for the electron and that is
that. Special relativity allows different observers to see different time ordering and hence
opposite charges flowing toward the future.
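
The time-ordering flip is a one-line consequence of the boost formula, and it may help to see actual numbers. The sketch below uses hypothetical function names and units with $c = 1$; it evaluates the boosted time difference and the rapidity at which the ordering flips for a spacelike separation.

```python
import math


def boosted_time_difference(dt, dx, phi):
    """Time difference seen by an observer boosted with rapidity phi along the 1-direction."""
    return math.cosh(phi) * dt - math.sinh(phi) * dx


def critical_rapidity(dt, dx):
    """Rapidity at which the two events become simultaneous (requires |dx| > |dt|)."""
    if abs(dx) <= abs(dt):
        raise ValueError("separation is not spacelike; the time ordering cannot flip")
    return math.atanh(dt / dx)
```

For example, with $y^0 - x^0 = 1$ and $y^1 - x^1 = 2$ the ordering flips once $\varphi$ exceeds $\tanh^{-1}(1/2) \approx 0.55$, while for a timelike separation no boost changes the sign.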

Exercises 

II.8.1 Show that averaging and summing over photon polarizations amounts to replacing the square bracket
in (9) by $2[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} - \sin^2\theta]$. [Hint: We are working in the transverse gauge.]

II.8.2 Repeat the calculation of Compton scattering for circularly polarized photons.


Cutting Off Our Ignorance


Who is afraid of infinities? Not I, I just cut them off. 

—Anonymous 


An apparent sleight of hand 

The pioneers of quantum field theory were enormously puzzled by the divergent integrals 
that they often encountered in their calculations, and they spent much of the 1930s 
and 1940s struggling with these infinities. Many leading lights of the day, driven to 
desperation, advocated abandoning quantum field theory altogether. Eventually, a so-called 
renormalization procedure was developed whereby the infinities were argued away and 
finite physical results were obtained. But for many years, well into the late 1960s and even
the 1970s, many physicists looked upon renormalization theory suspiciously as a sleight of
hand. Jokes circulated that in quantum field theory infinity is equal to zero and that under
the rug in a field theorist’s office had been swept many infinities.

Eventually, starting in the 1970s, a better understanding of quantum field theory was
developed through the efforts of Ken Wilson and many others. Field theorists gradually
came to realize that there is no problem of divergences in quantum field theory at all. We
now understand quantum field theory as an effective low energy theory in a sense I will
explain briefly here and in more detail in chapter VIII.3.


Field theory blowing up 


We have to see an infinity before we can talk about how to deal with infinities. Well, we
saw one in chapter I.7. Recall that the order $\lambda^2$ correction (I.7.23) to the meson-meson
scattering amplitude diverges. With $K \equiv k_1 + k_2$, we have

$$\mathcal{M} = \frac{1}{2}(-i\lambda)^2 i^2\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - m^2 + i\epsilon}\,\frac{1}{(K - k)^2 - m^2 + i\epsilon} \tag{1}$$

As I remarked back in chapter I.7, even without doing any calculations we can see the
problem that confounded the pioneers of quantum field theory. The integrand goes as $1/k^4$
for large $k$ and thus the integral diverges logarithmically as $\int d^4k/k^4$. (The ordinary integral
$\int^\infty dr\,r^n$ diverges linearly for $n = 0$, quadratically for $n = 1$, and so on, and $\int^\infty dr/r$
diverges logarithmically.) Since this divergence is associated with large values of $k$ it is
known as an ultraviolet divergence.
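
The vocabulary of linear, quadratic, and logarithmic divergence can be made concrete by cutting the ordinary integral off at some large upper limit. Here is a small sketch (the function name and the choice of lower limit are illustrative assumptions) using the closed form of $\int dr\,r^n$ between a fixed lower limit and the cutoff.

```python
import math


def cutoff_integral(n, cutoff, lower=1.0):
    """Integral of r**n dr from `lower` up to `cutoff`."""
    if n == -1:
        return math.log(cutoff / lower)
    return (cutoff ** (n + 1) - lower ** (n + 1)) / (n + 1)
```

Doubling the cutoff roughly doubles the $n = 0$ integral and quadruples the $n = 1$ integral, but merely adds $\log 2$ to the $n = -1$ integral. The four-dimensional integral $\int d^4k/k^4$ behaves like the last case, since after rotating to Euclidean space the angular integration leaves a radial integral of the form $\int dk/k$.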

To see how to deal with this apparent infinity, we have to distinguish between two con¬ 
ceptually separate issues, associated with the terrible names “regularization” and “renor¬ 
malization” for historical reasons. 


Parametrization of ignorance 

Suppose we are studying quantum electrodynamics instead of this artificial $\varphi^4$ theory. It
would be utterly unreasonable to insist that the theory of an electron interacting with a
photon would hold to arbitrarily high energies. At the very least, with increasingly higher
energies other particles come in, and eventually electrodynamics becomes merely part of
a larger electroweak theory. Indeed, these days it is thought that as we go to higher and
higher energies the whole edifice of quantum field theory will ultimately turn out to be an
approximation to a theory whose identity we don’t yet know, but probably a string theory
according to some physicists.

The modern view is that quantum field theory should be regarded as an effective low
energy theory, valid up to some energy (or momentum in a Lorentz invariant theory) scale
$\Lambda$. We can imagine living in a universe described by our toy $\varphi^4$ theory. As physicists in
this universe explore physics to higher and higher momentum scales they will eventually
discover that their universe is a mattress constructed out of mass points and springs. The
scale $\Lambda$ is roughly the inverse of the lattice spacing.

When I teach quantum field theory, I like to write “Ignorance is no shame” on the 
blackboard for emphasis when I get to this point. Every physical theory should have a 
domain of validity beyond which we are ignorant of the physics. Indeed were this not true 
physics would not have been able to progress. It is a good thing that Feynman, Schwinger, 
Tomonaga, and others who developed quantum electrodynamics did not have to know 
about the charm quark for example. 

I emphasize that $\Lambda$ should be thought of as physical, parametrizing our threshold of
ignorance, and not as a mathematical construct.¹ Indeed, physically sensible quantum
field theories should all come with an implicit $\Lambda$. If anyone tries to sell you a field theory
claiming that it holds up to arbitrarily high energies, you should check to see if he sold
used cars for a living. (As I wrote this, a colleague who is an editor of Physical Review Letters
told me that he worked as a garbage collector during high school vacations, adding jokingly
that this experience prepared him well for his present position.)

1 We saw a particularly vivid example of this in chapter I.8. When we define a conducting plate as a surface
on which a tangential electric field vanishes, we are ignorant of the physics of the electrons rushing about to
counter any such imposed field. At extremely high frequencies, the electrons can’t rush about fast enough and
new physics comes in, namely that high frequency modes do not see the plates. In calculating the Casimir force
we parametrize our ignorance with $a \sim \Lambda^{-1}$.

Thus, in evaluating (1) we should integrate only up to $\Lambda$, known as a cutoff. We
literally cut off the momentum integration (fig. III.1.1).² The integral is said to have been
“regularized.”

Since my philosophy in this book is to emphasize the conceptual rather than the
computational, I will not actually do the integral but merely note that it is equal to
$2iC\log(\Lambda^2/K^2)$ where $C$ is some numerical constant that you can compute if you want
(see appendix 1 to this chapter). For the sake of simplicity I also assumed that $m^2 \ll K^2$
so that we could neglect $m^2$ in the integrand. It is convenient to use the kinematic variables
$s \equiv K^2 = (k_1 + k_2)^2$, $t \equiv (k_1 - k_3)^2$, and $u \equiv (k_1 - k_4)^2$ introduced in chapter II.6. (Writing
out the $k_j$’s explicitly in the center-of-mass frame, you see that $s$, $t$, and $u$ are related to
rather mundane quantities such as the center-of-mass energy and the scattering angle.)
After all this, the meson-meson scattering amplitude reads

$$\mathcal{M} = -i\lambda + iC\lambda^2\left[\log\left(\frac{\Lambda^2}{s}\right) + \log\left(\frac{\Lambda^2}{t}\right) + \log\left(\frac{\Lambda^2}{u}\right)\right] + O(\lambda^3) \tag{2}$$

This much is easy enough to understand. After regularization, we speak of cutoff-dependent
quantities instead of divergent quantities, and $\mathcal{M}$ depends logarithmically on
the cutoff.


2 A. Zee, Einstein’s Universe, p. 204. Cartooning schools apparently teach that physicists in general, and
quantum field theorists in particular, all wear lab coats.


What is actually measured

Now that we have dealt with regularization, let us turn to renormalization, a terrible word 
because it somehow implies we are doing normalization again when in fact we haven’t 
yet. 

The key here is to imagine what we would tell an experimentalist about to measure
meson-meson scattering. We tell her (or him if you insist) that we need a cutoff $\Lambda$ and she
is not bothered at all; to an experimentalist it makes perfect sense that any given theory
has a finite domain of validity.

Our calculation is supposed to tell her how the scattering will depend on the center-of-mass
energy and the scattering angle. So we show her the expression in (2). She points to
$\lambda$ and exclaims, “What in the world is that?”

We answer, “The coupling constant,” but she says, “What do you mean, coupling
constant, it’s just a Greek letter!”

A confused student, Confusio, who has been listening in, pipes up, “Why the fuss? I
have been studying physics for years and years, and the teachers have shown us lots of
equations with Latin and Greek letters, for example, Hooke’s law $F = -kx$, and nobody
gets upset about $k$ being just a Latin letter.”

Smart Experimentalist: “But that is because if you give me a spring I can go out and
measure $k$. That’s the whole point! Mr. Egghead Theorist here has to tell me how to
measure this $\lambda$.”

Whoa, that is a darn smart experimentalist. We now have to think more carefully about what
a coupling constant really means. Think about $\alpha$, the coupling constant of quantum
electrodynamics. Well, it is the coefficient of $1/r$ in Coulomb’s law. Fine, Monsieur Coulomb
measured it using metallic balls or something. But a modern experimentalist could just
as well have measured $\alpha$ by scattering an electron at such and such an energy and at
such and such a scattering angle off a proton. We explain all this to our experimentalist
friend.

SE, nodding, agrees: “Oh yes, recently my colleague so and so measured the coupling for
meson-meson interaction by scattering one meson off another at such and such an energy
and at such and such a scattering angle, which correspond to your variables $s$, $t$, and $u$
having values $s_0$, $t_0$, and $u_0$. But what does the coupling constant my colleague measured,
let us call it $\lambda_P$, with the subscript meaning “physical,” have to do with your theoretical $\lambda$,
which, as far as I am concerned, is just a Greek letter in something you call a Lagrangian!”

Confusio: “Hey, if she’s going to worry about small lambda, I am going to worry about
big lambda. How do I know how big the domain of validity is?”

SE: “Confusio, you are not as dumb as you look! Mr. Egghead Theorist, if I use your
formula (2), what is the precise value of $\Lambda$ that I am supposed to plug in? Does it depend
on your mood, Mr. Theorist? If you wake up feeling optimistic, do you use $2\Lambda$ instead of
$\Lambda$? And if your girl friend left you, you use $\frac{1}{2}\Lambda$?”

We assert, “Ha, we know the answer to that one. Look at (2): $\mathcal{M}$ is supposed to be an
actual scattering amplitude and should not depend on $\Lambda$. If someone wants to change $\Lambda$
we just shift $\lambda$ in such a way so that $\mathcal{M}$ does not change. In fact, a couple lines of arithmetic
will show you precisely what $d\lambda/d\Lambda$ has to be (see exercise III.1.3).”

SE: “Okay, so $\lambda$ is secretly a function of $\Lambda$. Your notation is lousy.”

We admit, “Exactly, this bad notation has confused generations of physicists.”

SE: “I am still waiting to hear how the $\lambda_P$ my experimental colleague measured is related
to your $\lambda$.”

We say, “Aha, that’s easy. Just look at (2), which is repeated here for clarity and for your
reading convenience:


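$$\mathcal{M} = -i\lambda + iC\lambda^2\left[\log\left(\frac{\Lambda^2}{s}\right) + \log\left(\frac{\Lambda^2}{t}\right) + \log\left(\frac{\Lambda^2}{u}\right)\right] + O(\lambda^3) \tag{3}$$

According to our theory, $\lambda_P$ is given by

$$-i\lambda_P = -i\lambda + iC\lambda^2\left[\log\left(\frac{\Lambda^2}{s_0}\right) + \log\left(\frac{\Lambda^2}{t_0}\right) + \log\left(\frac{\Lambda^2}{u_0}\right)\right] + O(\lambda^3) \tag{4}$$

To show you clearly what is involved, let us denote the sum of logarithms in the square
bracket in (3) and in (4) by $L$ and by $L_0$, respectively, so that we can write (3) and (4) more
compactly as

$$\mathcal{M} = -i\lambda + iC\lambda^2 L + O(\lambda^3) \tag{5}$$

and

$$-i\lambda_P = -i\lambda + iC\lambda^2 L_0 + O(\lambda^3) \tag{6}$$

That is how $\lambda_P$ and $\lambda$ are related.”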

SE: “If you give me the scattering amplitude expressed in terms of the physical coupling
$\lambda_P$ then it’s of use to me, but it’s not of use in terms of $\lambda$. I understand what $\lambda_P$ is, but
not $\lambda$.”

We answer: “Fine, it just takes two lines of algebra to eliminate $\lambda$ in favor of $\lambda_P$. Big deal.
Solving (6) for $\lambda$ gives

$$-i\lambda = -i\lambda_P - iC\lambda^2 L_0 + O(\lambda^3) = -i\lambda_P - iC\lambda_P^2 L_0 + O(\lambda_P^3) \tag{7}$$

The second equality is allowed to the order of approximation indicated. Now plug this
into (5)

$$\mathcal{M} = -i\lambda + iC\lambda^2 L + O(\lambda^3) = -i\lambda_P - iC\lambda_P^2 L_0 + iC\lambda_P^2 L + O(\lambda_P^3) \tag{8}$$

Please check that all manipulations are legitimate up to the order of approximation
indicated.”


The “miracle” 


Lo and behold! The miracle of renormalization!

Now in the scattering amplitude $\mathcal{M}$ we have the combination $L - L_0 = [\log(s_0/s) +
\log(t_0/t) + \log(u_0/u)]$. In other words, the scattering amplitude comes out as

$$\mathcal{M} = -i\lambda_P + iC\lambda_P^2\left[\log\left(\frac{s_0}{s}\right) + \log\left(\frac{t_0}{t}\right) + \log\left(\frac{u_0}{u}\right)\right] + O(\lambda_P^3) \tag{9}$$

We announce triumphantly to our experimentalist friend that when the scattering
amplitude is expressed in terms of the physical coupling constant $\lambda_P$ as she had wanted,
the cutoff $\Lambda$ disappears completely!
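
The two lines of algebra can also be watched at work numerically. The sketch below is an illustration under stated assumptions, not the text's calculation: it strips the factors of $i$ by working with $i\mathcal{M}$, takes $C = 1/(32\pi^2)$ as an illustrative value of the numerical constant, and uses positive made-up values of $s$, $t$, $u$ so that the logarithms are real. All function names are hypothetical.

```python
import math

C = 1.0 / (32.0 * math.pi ** 2)  # illustrative value for the constant C


def log_sum(cutoff, s, t, u):
    """L = log(cutoff^2/s) + log(cutoff^2/t) + log(cutoff^2/u)."""
    return sum(math.log(cutoff ** 2 / x) for x in (s, t, u))


def bare_coupling(lam_p, cutoff, s0, t0, u0):
    """lambda as a function of the cutoff, from (7) truncated at second order."""
    return lam_p + C * lam_p ** 2 * log_sum(cutoff, s0, t0, u0)


def amplitude_from_bare(lam, cutoff, s, t, u):
    """i*M from (5): lambda - C lambda^2 L."""
    return lam - C * lam ** 2 * log_sum(cutoff, s, t, u)


def amplitude_physical(lam_p, s, t, u, s0, t0, u0):
    """i*M from (9): no cutoff appears."""
    logs = math.log(s0 / s) + math.log(t0 / t) + math.log(u0 / u)
    return lam_p - C * lam_p ** 2 * logs
```

Feeding `bare_coupling(...)` into `amplitude_from_bare(...)` for wildly different cutoffs reproduces `amplitude_physical(...)` up to terms of order $\lambda_P^3$, even though the bare coupling itself changes with the cutoff.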


The answer should always be in terms of physically measurable quantities 

The lesson here is that we should express physical quantities not in terms of “fictitious”
theoretical quantities such as $\lambda$, but in terms of physically measurable quantities such as
$\lambda_P$.

By the way, in the literature, $\lambda_P$ is often denoted by $\lambda_R$ and for historical reasons called the
“renormalized coupling constant.” I think that the physics of “renormalization” is much
clearer with the alternative term “physical coupling constant,” hence the subscript P. We
never did have a “normalized coupling constant.”

Suddenly Confusio pipes up again; we have almost forgotten him!

Confusio: “You started out with an $\mathcal{M}$ in (2) with two unphysical quantities $\lambda$ and $\Lambda$,
and their “unphysicalness” sort of cancel each other out.”

SE: “Yeah, it is reminiscent of what distinguishes the good theorists from the bad ones. 
The good ones always make an even number of sign errors, and the bad ones always make 
an odd number.” 


Integrating over only the slow modes 

In the path integral formulation, the scattering amplitude $\mathcal{M}$ discussed here is obtained
by evaluating the integral (chapter I.7)

$$\int D\varphi\,\varphi(x_1)\varphi(x_2)\varphi(x_3)\varphi(x_4)\,e^{i\int d^4x\{\frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - \frac{\lambda}{4!}\varphi^4\}}$$

The regularization used here corresponds roughly to restricting ourselves, in the integral
$\int D\varphi$, to integrating over only those field configurations $\varphi(x)$ whose Fourier transform
$\varphi(k)$ vanishes for $k \gtrsim \Lambda$. In other words, the fields corresponding to the internal lines in
the Feynman diagrams in fig. (I.7.10) are not allowed to fluctuate too energetically. We will
come back to this path integral formulation later when we discuss the renormalization
group.


Alternative lifestyles 

I might also mention that there are a number of alternative ways of regularizing Feynman
diagrams, each with advantages and disadvantages that make them suitable for some
calculations but not others. The regularization used here, known as Pauli-Villars, has the
advantage of being physically transparent. Another often used regularization is known as
dimensional regularization. We pretend that we are calculating in $d$-dimensional spacetime.
After the Feynman integral has been beaten down to a suitable form, we do an analytic
continuation in $d$ and set $d = 4$ at the end of the day. The cutoff dependences of various
integrals now show up as poles as we let $d \to 4$. Just as the cutoff $\Lambda$ disappears when
the scattering amplitude is expressed in terms of the physical coupling constant $\lambda_P$, in
dimensional regularization the scattering amplitude expressed in terms of $\lambda_P$ is free of poles.
While dimensional regularization proves to be useful in certain contexts as I will note in a
later chapter, it is considerably more abstract and formal than Pauli-Villars regularization.
Each to his or her own taste when it comes to regularizing.

Since the emphasis in this book is on the conceptual rather than the computational, I
won’t discuss other regularization schemes but will merely sketch how Pauli-Villars and
dimensional regularizations work in two appendixes to this chapter.


Appendix 1: Pauli-Villars regularization 


The important message of this chapter is the conceptual point that when physical amplitudes are expressed in
terms of physical coupling constants the cutoff dependence disappears. The actual calculation of the Feynman
integral is unimportant. But I will show you how to do the integral just in case you would like to do Feynman
integrals for a living.

Let us start with the convergent integral

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - c^2 + i\epsilon)^3} = \frac{-i}{32\pi^2c^2} \tag{10}$$

The dependence on $c^2$ follows from dimensional analysis. The overall factor is calculated in appendix D.
Applying the identity (D.15)

$$\frac{1}{xy} = \int_0^1 d\alpha\,\frac{1}{[\alpha x + (1 - \alpha)y]^2} \tag{11}$$

to (1) we have

$$\mathcal{M} = \frac{1}{2}(-i\lambda)^2 i^2\int\frac{d^4k}{(2\pi)^4}\int_0^1 d\alpha\,\frac{1}{D}$$

with

$$D = [\alpha(K - k)^2 + (1 - \alpha)k^2 - m^2 + i\epsilon]^2 = [(k - \alpha K)^2 + \alpha(1 - \alpha)K^2 - m^2 + i\epsilon]^2$$

Shift the integration variable $k \to k + \alpha K$ and we meet the integral $\int[d^4k/(2\pi)^4][1/(k^2 - c^2 + i\epsilon)^2]$, where
$c^2 = m^2 - \alpha(1 - \alpha)K^2$. Pauli and Villars proposed replacing it by


$$\int\frac{d^4k}{(2\pi)^4}\left[\frac{1}{(k^2 - c^2 + i\epsilon)^2} - \frac{1}{(k^2 - \Lambda^2 + i\epsilon)^2}\right] \tag{12}$$

with $\Lambda^2 \gg c^2$. For $k$ much smaller than $\Lambda$ the added second term in the integrand is of order $\Lambda^{-4}$ and is negligible
compared to the first term since $\Lambda$ is much larger than $c$. For $k$ much larger than $\Lambda$, the two terms almost cancel
and the integrand vanishes rapidly with increasing $k$, effectively cutting off the integral.

Upon differentiating (12) with respect to $c^2$ and using (10) we deduce that (12) must be equal to
$(i/16\pi^2)\log(\Lambda^2/c^2)$. Thus, the integral

$$\int^\Lambda \frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - c^2 + i\varepsilon)^2} = \frac{i}{16\pi^2}\log\left(\frac{\Lambda^2}{c^2}\right) \tag{13}$$

is indeed logarithmically dependent on the cutoff, as anticipated in the text.

For what it is worth, we obtain

$$\mathcal{M} = \frac{i\lambda^2}{32\pi^2}\int_0^1 d\alpha\,\log\left(\frac{\Lambda^2}{m^2 - \alpha(1-\alpha)K^2 - i\varepsilon}\right) \tag{14}$$
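The logarithm in (13) can be checked numerically. The following is a minimal sketch, assuming we have already rotated to Euclidean space, where the regulated integral (12) should equal $(1/16\pi^2)\log(\Lambda^2/c^2)$ without the factor of $i$. The function name `pauli_villars_euclidean`, the logarithmic grid in $k$, and the sample values $c = 1$, $\Lambda = 100$ are illustrative choices, not anything taken from the text.

```python
import math

def pauli_villars_euclidean(c, lam, n=20000, k_min=1e-6, k_max=1e8):
    """Euclidean form of (12):
    integral d^4k/(2 pi)^4 [1/(k^2+c^2)^2 - 1/(k^2+lam^2)^2],
    done with Simpson's rule in the variable s = ln k (n must be even)."""
    def integrand(s):
        k2 = math.exp(2.0 * s)
        a = k2 + c * c
        b = k2 + lam * lam
        # 1/a^2 - 1/b^2 written as a single fraction to avoid cancellation
        diff = (lam * lam - c * c) * (a + b) / (a * a * b * b)
        # surface of the unit 3-sphere is 2 pi^2; measure k^3 dk = k^4 ds
        return 2.0 * math.pi**2 / (2.0 * math.pi)**4 * k2 * k2 * diff

    lo, hi = math.log(k_min), math.log(k_max)
    h = (hi - lo) / n
    total = integrand(lo) + integrand(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(lo + j * h)
    return total * h / 3.0

if __name__ == "__main__":
    c, lam = 1.0, 100.0
    print(pauli_villars_euclidean(c, lam))
    print(math.log(lam**2 / c**2) / (16 * math.pi**2))
```

The two printed numbers should agree closely; doubling `lam` should raise both by the same amount, $\log 4/(16\pi^2)$, which is the logarithmic cutoff dependence at work.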


Appendix 2: Dimensional regularization 


The basic idea behind dimensional regularization is very simple. When we reach
$I = \int[d^4k/(2\pi)^4][1/(k^2 - c^2 + i\varepsilon)^2]$ we rotate to Euclidean space and generalize to $d$ dimensions (see appendix D):

$$I(d) = i\int \frac{d^d_E k}{(2\pi)^d}\,\frac{1}{(k^2 + c^2)^2} = i\left[\frac{2\pi^{d/2}}{\Gamma(d/2)}\right]\frac{1}{(2\pi)^d}\int_0^\infty dk\,k^{d-1}\,\frac{1}{(k^2 + c^2)^2}$$


As I said, I don’t want to get bogged down in computation in this book, but we’ve got to do what we’ve got to do.
Changing the integration variable by setting $k^2 + c^2 = c^2/x$ we find

$$\int_0^\infty dk\,k^{d-1}\,\frac{1}{(k^2 + c^2)^2} = \frac{1}{2}c^{d-4}\int_0^1 dx\,(1-x)^{d/2-1}x^{1-d/2},$$

which we are supposed to recognize as the integral representation of the beta function. After the dust settles, we
obtain

$$i\int \frac{d^d_E k}{(2\pi)^d}\,\frac{1}{(k^2 + c^2)^2} = i\,\frac{1}{(4\pi)^{d/2}}\,\Gamma\left(\frac{4-d}{2}\right)c^{d-4} \tag{15}$$


As $d \to 4$, the right-hand side becomes

$$\frac{i}{(4\pi)^2}\left[\frac{2}{4-d} - \log c^2 + \log(4\pi) - \gamma + O(d-4)\right]$$

where $\gamma = 0.577\cdots$ denotes the Euler-Mascheroni constant.

Comparing with (13) we see that $\log\Lambda^2$ in Pauli-Villars regularization has been effectively replaced by the pole
$2/(4-d)$. As noted in the text, when physical quantities are expressed in terms of physical coupling constants,
all such poles cancel.
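The expansion around $d = 4$ is easy to verify numerically. Here is a minimal sketch that compares the right-hand side of (15), with the overall factor $i/(4\pi)^2$ stripped off, against the pole-plus-finite-terms expression. The function names and the sample values of $d$ and $c$ are my own illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def exact_bracket(d, c):
    """Gamma((4-d)/2) * (4 pi)^((4-d)/2) * c^(d-4), i.e. the right-hand
    side of (15) divided by i/(4 pi)^2."""
    eps = 4.0 - d
    return math.gamma(eps / 2.0) * (4.0 * math.pi) ** (eps / 2.0) * c ** (-eps)

def pole_expansion(d, c):
    """2/(4-d) - log c^2 + log(4 pi) - gamma, dropping the O(d-4) remainder."""
    return 2.0 / (4.0 - d) - math.log(c * c) + math.log(4.0 * math.pi) - EULER_GAMMA

if __name__ == "__main__":
    for d in (3.9, 3.99, 3.999):
        print(d, exact_bracket(d, 1.0) - pole_expansion(d, 1.0))
```

The printed differences should shrink roughly in proportion to $4 - d$, as the $O(d-4)$ remainder requires.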



Exercises 


III.1.1 Work through the manipulations leading to (9) without referring to the text.

III.1.2 Regard (1) as an analytic function of $K^2$. Show that it has a cut extending from $4m^2$ to infinity. [Hint: If
you can’t extract this result directly from (1) look at (14). An extensive discussion of this exercise will be
given in chapter III.8.]

III.1.3 Change $\Lambda$ to $e^{\varepsilon}\Lambda$. Show that for $\mathcal{M}$ not to change, to the order indicated $\lambda$ must change by $\delta\lambda =
6\varepsilon C\lambda^2 + O(\lambda^3)$, that is,

$$\Lambda\frac{d\lambda}{d\Lambda} = 6C\lambda^2 + O(\lambda^3)$$



III.2 Renormalizable versus Nonrenormalizable


Old view versus new view 

We learned that if we were to write the meson-meson scattering amplitude in terms
of a physically measured coupling constant $\lambda_P$, the dependence on the cutoff $\Lambda$ would
disappear (at least to order $\lambda_P^2$). Were we lucky or what?

Well, it turns out that there are quantum field theories in which this would happen and 
that there are quantum field theories in which this would not happen, which gives us a 
binary classification of quantum field theories. Again, for historical reasons, the former 
are known as “renormalizable theories” and are considered “nice.” The latter are known 
as “nonrenormalizable theories,” evoking fear and loathing in theoretical physicists. 

Actually, with the new view of field theories as effective low energy theories to some 
underlying theory, physicists now look upon nonrenormalizable theories in a much more 
sympathetic light than a generation ago. I hope to make all these remarks clear in this and 
a later chapter. 


High school dimensional analysis 

Let us begin with some high school dimensional analysis. In natural units in which $\hbar = 1$
and c = 1, length and time have the same dimension, the inverse of the dimension of mass 
(and of energy and momentum). Particle physicists tend to count dimension in terms of 
mass as they are used to thinking of energy scales. Condensed matter physicists, on the 
other hand, usually speak of length scales. Thus, a given field operator has (equal and) 
opposite dimensions in particle physics and in condensed matter physics. We will use the 
convention of the particle physicists. 

Since the action $S = \int d^4x\,\mathcal{L}$ appears in the path integral as $e^{iS}$, it is clearly dimensionless,
thus implying that the Lagrangian (Lagrangian density, strictly speaking) $\mathcal{L}$ has the
same dimension as the 4th power of a mass. We will use the notation $[\mathcal{L}] = 4$ to indicate
that $\mathcal{L}$ has dimension 4. In this notation $[x] = -1$ and $[\partial] = 1$. Consider the scalar field
theory $\mathcal{L} = \frac{1}{2}[(\partial\varphi)^2 - m^2\varphi^2] - \lambda\varphi^4$. For the term $(\partial\varphi)^2$ to have dimension 4, we see that
$[\varphi] = 1$ (since $2(1 + [\varphi]) = 4$). This then implies that $[\lambda] = 0$, that is, the coupling $\lambda$ is dimensionless.
The rule is simply that for each term in $\mathcal{L}$, the dimensions of the various pieces,
including the coupling constant and mass, have to add up to 4 (thus, e.g., $[\lambda] + 4[\varphi] = 4$).

How about the fermion field $\psi$? Applying this rule to the Lagrangian $\mathcal{L} = \bar\psi i\gamma^\mu\partial_\mu\psi +
\cdots$ we see that $[\psi] = \frac{3}{2}$. (Henceforth we will suppress the $\cdots$; it is understood that we
are looking at a piece of the Lagrangian. Furthermore, since we are doing dimensional
analysis we will often suppress various irrelevant factors, such as numerical factors and
the gamma matrices in the Fermi interaction that we will come to presently.) Looking at
the coupling $f\varphi\bar\psi\psi$ we see that the Yukawa coupling $f$ is dimensionless. In contrast, in
the theory of the weak interaction with $\mathcal{L} = G\bar\psi\psi\bar\psi\psi$ we see that the Fermi coupling $G$
has dimension $-2$ (since $-2 + 4(\frac{3}{2}) = 4$; got that?).

From the Maxwell Lagrangian $-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ we see that $[A_\mu] = 1$ and hence $A_\mu$ has the
same dimension as $\partial_\mu$: The vector field has the same dimension as the scalar field. The
electromagnetic coupling $eA_\mu\bar\psi\gamma^\mu\psi$ tells us that $e$ is dimensionless, which we can also
deduce from Coulomb’s law written in natural units $V(r) = \alpha/r$, with the fine structure
constant $\alpha = e^2/4\pi$.
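The counting rule above is mechanical enough to hand to a few lines of code. The following is a minimal sketch, assuming that in $d$ spacetime dimensions a Bose field (scalar or vector) has dimension $(d-2)/2$, a Fermi field has dimension $(d-1)/2$, and each derivative counts 1. The function name `coupling_dimension` and its arguments are hypothetical names chosen for illustration.

```python
from fractions import Fraction

def coupling_dimension(n_bose=0, n_fermi=0, n_deriv=0, d=4):
    """Mass dimension of the coefficient of a Lagrangian term containing
    n_bose Bose fields, n_fermi Fermi fields and n_deriv derivatives,
    fixed by requiring the whole term to have dimension d."""
    bose_dim = Fraction(d - 2, 2)    # from the kinetic term (d phi)^2
    fermi_dim = Fraction(d - 1, 2)   # from the kinetic term psibar d psi
    return d - n_bose * bose_dim - n_fermi * fermi_dim - n_deriv

if __name__ == "__main__":
    print("phi^4 coupling:", coupling_dimension(n_bose=4))
    print("Yukawa coupling:", coupling_dimension(n_bose=1, n_fermi=2))
    print("Fermi coupling:", coupling_dimension(n_fermi=4))
    print("electric charge:", coupling_dimension(n_bose=1, n_fermi=2))
```

Exact fractions are used so that the half-integer fermion dimension never turns into a rounding question.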


Scattering amplitude blows up 

We are now ready for a heuristic argument regarding the nonrenormalizability of a theory.
Consider Fermi’s theory of the weak interaction. Imagine calculating the amplitude $\mathcal{M}$
for a four-fermion interaction, say neutrino-neutrino scattering at an energy much smaller
than $\Lambda$. In lowest order, $\mathcal{M} \sim G$. Let us try to write down the amplitude to the next order:
$\mathcal{M} \sim G + G^2(?)$, where we will try to guess what (?) is. Since all masses and energies
are by definition small compared to the cutoff $\Lambda$, we can simply set them equal to zero.
Since $[G] = -2$, by high school dimensional analysis the unknown factor (?) must have
dimension $+2$. The only possibility for (?) is $\Lambda^2$. Hence, the amplitude to the next order
must have the form $\mathcal{M} \sim G + G^2\Lambda^2$. We can also check this conclusion by looking at the
Feynman diagram in figure III.2.1: Indeed it goes as $G^2\int^\Lambda d^4p\,(1/\not p)(1/\not p) \sim G^2\Lambda^2$.

Without a cutoff on the theory, or equivalently with $\Lambda = \infty$, theorists realized that the
theory was sick: Infinity was the predicted value for a physical quantity. Fermi’s weak
interaction theory was said to be nonrenormalizable. Furthermore, if we try to calculate to
higher order, each power of $G$ is accompanied by another factor of $\Lambda^2$.

In desperation, some theorists advocated abandoning quantum field theory altogether.
Others expended an enormous amount of effort trying to “cure” weak interaction theory.
For instance, one approach was to speculate that the series (with coefficients suppressed)
$\mathcal{M} \sim G[1 + G\Lambda^2 + (G\Lambda^2)^2 + (G\Lambda^2)^3 + \cdots]$ summed to $Gf(G\Lambda^2)$, where the unknown
function $f$ might have the property that $f(\infty)$ was finite. In hindsight, we now know that
this is not a fruitful approach.

Instead, what happened was that toward the late 1960s S. Glashow, A. Salam, and S.
Weinberg, building on the efforts of many others, succeeded in constructing an electroweak
theory unifying the electromagnetic and weak interactions, as I will discuss in
chapter VII.2. Fermi’s weak interaction theory emerges within electroweak theory as a
low energy effective theory.


Fermi’s theory cried out 

In modern terms, we think of the cutoff $\Lambda$ as really being there and we hear the cutoff
dependence of the four-fermion interaction amplitude $\mathcal{M} \sim G + G^2\Lambda^2$ as the sound of the
theory crying out that something dramatic has to happen at the energy scale $\Lambda \sim (1/G)^{1/2}$.
The second term in the perturbation series becomes comparable to the first, so at the very
least perturbation theory fails.

Here is another way of making the same point. Suppose that we don’t know anything
about cutoff and all that. With $G$ having mass dimension $-2$, just by high school dimensional
analysis we see that the neutrino-neutrino scattering amplitude at center-of-mass
energy $E$ has to go as $\mathcal{M} \sim G + G^2E^2 + \cdots$. When $E$ reaches the scale $\sim (1/G)^{1/2}$ the amplitude
reaches order unity and some new physics must take over just because the cross
section is going to violate the unitarity bound from basic quantum mechanics. (Remember
phase shift and all that?)
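To put a number on the scale $(1/G)^{1/2}$, here is a minimal order-of-magnitude sketch. The value of the Fermi constant used, about $1.17 \times 10^{-5}\ \mathrm{GeV}^{-2}$, is the measured one and is an outside input that does not appear in the text; the function names are illustrative, and all numerical factors of order unity are ignored, just as in the dimensional argument itself.

```python
import math

# Measured Fermi constant in GeV^-2 (assumed outside input, not from the text)
G_FERMI = 1.1663787e-5

def breakdown_scale(coupling):
    """Energy E at which coupling * E^2 is of order 1, for a coupling
    of mass dimension -2."""
    return 1.0 / math.sqrt(coupling)

def relative_correction(coupling, energy):
    """Size of the second term G^2 E^2 relative to the first term G."""
    return coupling * energy**2

if __name__ == "__main__":
    print("breakdown scale in GeV:", breakdown_scale(G_FERMI))
    for energy in (1.0, 10.0, 100.0):
        print(energy, "GeV ->", relative_correction(G_FERMI, energy))
```

The scale comes out at a few hundred GeV, while at energies of a few GeV the correction is tiny, which is why the Fermi theory served so well at low energy.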

In fact, what that something is goes back to Yukawa, who at the same time that he 
suggested the meson theory for the nuclear forces also suggested that an intermediate 
vector boson could account for the Fermi theory of the weak interaction. (In the 1930s the 
distinction between the strong and the weak interactions was far from clear.) Schematically, 
consider a theory of a vector boson of mass $M$ coupled to a fermion field via a dimensionless
coupling constant $g$:

$$\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + M^2A_\mu A^\mu + gA_\mu\bar\psi\gamma^\mu\psi \tag{1}$$

Let’s calculate fermion-fermion scattering. The Feynman diagram in figure III.2.2 generates
an amplitude $(-ig)^2(\bar u\gamma^\mu u)[i/(k^2 - M^2 + i\varepsilon)](\bar u\gamma_\mu u)$, which when the momentum
transfer $k$ is much less than $M$ becomes $i(g^2/M^2)(\bar u\gamma^\mu u)(\bar u\gamma_\mu u)$. But this is just as if the
fermions are interacting via a Fermi theory of the form $G(\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi)$ with $G = g^2/M^2$.

If we blithely calculate with the low energy effective theory $G(\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi)$, it cries
out that it is going to fail. Yes sir indeed, at the energy scale $(1/G)^{1/2} = M/g$, the vector
boson is produced. New physics appears.

I find it sobering and extremely appealing that theories in physics have the ability to 
announce their own eventual failure and hence their domains of validity, in contrast to 
theories in some other areas of human thought. 


Einstein’s theory is now crying out 

The theory of gravity is also notoriously nonrenormalizable. Simply comparing Newton’s
law $V(r) = G_N M_1 M_2/r$ with Coulomb’s $V(r) = \alpha/r$ we see that Newton’s gravitational
constant $G_N$ has mass dimension $-2$. No more need be said. We come to the same
morose conclusion that the theory of gravity, just like Fermi’s theory of weak interaction,
is nonrenormalizable. To repeat the argument, if we calculate graviton-graviton scattering
at energy $E$, we encounter the series $\sim [1 + G_N E^2 + (G_N E^2)^2 + \cdots]$.

Just as in our discussion of the Fermi theory, the nonrenormalizability of quantum gravity
tells us that at the Planck energy scale $(1/G_N)^{1/2} \equiv M_{\rm Planck} \sim 10^{19}\,m_{\rm proton}$ new physics
must appear. Fermi’s theory cried out, and the new physics turned out to be the electroweak
theory. Einstein’s theory is now crying out. Will the new physics turn out to be
string theory?¹


Exercise 


III.2.1 Consider the $d$-dimensional scalar field theory $S = \int d^dx\,(\frac{1}{2}(\partial\varphi)^2 + \frac{1}{2}m^2\varphi^2 + \lambda\varphi^4 + \cdots + \lambda_n\varphi^n + \cdots)$.
Show that $[\varphi] = (d-2)/2$ and $[\lambda_n] = n(2-d)/2 + d$. Note that $\varphi$ is dimensionless for $d = 2$.


1 J. Polchinski, String Theory.




III.3 Counterterms and Physical Perturbation Theory


Renormalizability 


The heuristic argument of the previous chapter indicates that theories whose coupling has
negative mass dimension are nonrenormalizable. What about theories with dimensionless
couplings, such as quantum electrodynamics and the $\varphi^4$ theory? As a matter of fact, both
of these theories have been proved to be renormalizable. But it is much more difficult to
prove that a theory is renormalizable than to prove that it is nonrenormalizable. Indeed,
the proof that nonabelian gauge theory (about which more later) is renormalizable took
the efforts of many eminent physicists, culminating in the work of ’t Hooft, Veltman, B.
Lee, Zinn-Justin, and many others.

Consider again the simple $\varphi^4$ theory. First, a trivial remark: The physical coupling
constant $\lambda_P$ is a function of $s_0$, $t_0$, and $u_0$ [see (III.1.4)]. For theoretical purposes it is much
less cumbersome to set $s_0$, $t_0$, and $u_0$ equal to $\mu^2$ and thus use, instead of (III.1.4), the
simpler definition

$$-i\lambda_P = -i\lambda + 3iC\lambda^2\log\left(\frac{\Lambda^2}{\mu^2}\right) + O(\lambda^3) \tag{1}$$

This is purely for theoretical convenience.¹

We saw that to order $\lambda^2$ the meson-meson scattering amplitude when expressed in terms
of the physical coupling $\lambda_P$ is independent of the cutoff $\Lambda$. How do we prove that this is true
to all orders in $\lambda$? Dimensional analysis only tells us that to any order in $\lambda$ the dependence of
the meson scattering amplitude on the cutoff must be a sum of terms going as $[\log(\Lambda/\mu)]^p$
with some power $p$.

The meson-meson scattering amplitude is certainly not the only quantity that depends
on the cutoff. Consider the inverse of the $\varphi$ propagator to order $\lambda^2$ as shown in figure III.3.1.


1 In fact, the kinematic point $s_0 = t_0 = u_0 = \mu^2$ cannot be reached experimentally, but that’s of concern to
theorists.

Figure III.3.1


The Feynman diagram in figure III.3.1a gives something like

$$-i\lambda\int^\Lambda \frac{d^4q}{(2\pi)^4}\,\frac{i}{q^2 - m^2 + i\varepsilon}$$


The precise value does not concern us; we merely note that it depends quadratically on the
cutoff $\Lambda$ but not on $k^2$. The diagram in figure III.3.1b involves a double integral

$$I(k, m, \Lambda; \lambda) \equiv (-i\lambda)^2\int^\Lambda\!\!\int^\Lambda \frac{d^4p}{(2\pi)^4}\frac{d^4q}{(2\pi)^4}\,\frac{i}{p^2 - m^2 + i\varepsilon}\,\frac{i}{q^2 - m^2 + i\varepsilon}\,\frac{i}{(p + q + k)^2 - m^2 + i\varepsilon} \tag{2}$$

Counting powers of $p$ and $q$ we see that the integral $\sim \int(d^8P/P^6)$ and so $I$ depends
quadratically on the cutoff $\Lambda$.

By Lorentz invariance $I$ is a function of $k^2$, which we can expand in a series $D + Ek^2 +
Fk^4 + \cdots$. The quantity $D$ is just $I$ with the external momentum $k$ set equal to zero and
so depends quadratically on the cutoff $\Lambda$. Next, we can obtain $E$ by differentiating $I$ with
respect to $k$ twice and then setting $k$ equal to zero. This clearly decreases the powers of $p$ and
$q$ in the integrand by 2 and so $E$ depends only logarithmically on the cutoff $\Lambda$. Similarly,
we can obtain $F$ by differentiating $I$ with respect to $k$ four times and then setting $k$ equal to
zero. This decreases the powers of $p$ and $q$ in the integrand by 4 and thus $F$ is given by an
integral that goes as $\sim \int d^8P/P^{10}$ for large $P$. The integral is convergent and hence cutoff
independent. We can clearly repeat the argument ad infinitum. Thus $F$ and the terms in
$(\cdots)$ are cutoff independent as the cutoff goes to infinity and we don’t have to worry about
them.

Putting it all together, we have the inverse propagator $k^2 - m^2 + a + bk^2$ up to $O(k^2)$ with
$a$ and $b$, respectively, quadratically and logarithmically cutoff dependent. The propagator
is changed to

$$\frac{1}{k^2 - m^2} \to \frac{1}{(1 + b)k^2 - (m^2 - a)} \tag{3}$$

The pole in $k^2$ is shifted to $m_P^2 \equiv m^2 + \delta m^2 \equiv (m^2 - a)(1 + b)^{-1}$, which we identify as
the physical mass. This shift is known as mass renormalization. Physically, it is quite
reasonable that quantum fluctuations will shift the mass.

What about the fact that the residue of the pole in the propagator is no longer 1 but
$(1 + b)^{-1}$?

To understand this shift in the residue, recall that we blithely normalized the field $\varphi$ so
that $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 + \cdots$. That the coefficient of $k^2$ in the lowest order inverse propagator
$k^2 - m^2$ is equal to 1 reflects the fact that the coefficient of $\frac{1}{2}(\partial\varphi)^2$ in $\mathcal{L}$ is equal to 1.
There is certainly no guarantee that with higher order corrections included the coefficient
of $\frac{1}{2}(\partial\varphi)^2$ in an effective $\mathcal{L}$ will stay at 1. Indeed, we see that it is shifted to $(1 + b)$. For
historical reasons, this is known as “wave function renormalization” even though there is
no wave function anywhere in sight. A more modern term would be field renormalization.
(The word renormalization makes some sense in this case, as we did normalize the field
without thinking too much about it.)
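The shifted pole and the shifted residue can be confirmed with exact arithmetic. The following minimal sketch treats $a$ and $b$ as given numbers and evaluates the corrected propagator $1/[(1+b)k^2 - (m^2 - a)]$ near its pole. The function names and the sample values of $m^2$, $a$, and $b$ are arbitrary illustrative choices.

```python
from fractions import Fraction

def corrected_propagator(k2, m2, a, b):
    """1 / [(1 + b) k^2 - (m^2 - a)], the corrected propagator without the i."""
    return 1 / ((1 + b) * k2 - (m2 - a))

def physical_mass_squared(m2, a, b):
    """Location of the pole in k^2: (m^2 - a) / (1 + b)."""
    return (m2 - a) / (1 + b)

def residue_at_pole(m2, a, b, step=Fraction(1, 10**6)):
    """(k^2 - m_P^2) times the propagator, evaluated just off the pole.
    For this simple pole the product does not depend on the step."""
    mP2 = physical_mass_squared(m2, a, b)
    k2 = mP2 + step
    return (k2 - mP2) * corrected_propagator(k2, m2, a, b)

if __name__ == "__main__":
    m2, a, b = Fraction(1), Fraction(1, 10), Fraction(1, 20)
    print("physical mass squared:", physical_mass_squared(m2, a, b))
    print("residue:", residue_at_pole(m2, a, b))
    print("1/(1+b):", 1 / (1 + b))
```

Using `Fraction` keeps everything exact, so the residue can be compared with $(1+b)^{-1}$ by equality rather than within a tolerance.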

Incidentally, it is much easier to say “logarithmically divergent” than to say “logarithmically
dependent on the cutoff $\Lambda$,” so we will often slip into this more historical and less accurate
jargon and use the word divergent. In $\varphi^4$ theory, the wave function renormalization and the
coupling renormalization are logarithmically divergent, while the mass renormalization
is quadratically divergent.


Bare versus physical perturbation theory 

What we have been doing thus far is known as bare perturbation theory. We should have
put the subscript 0 on what we have been calling $\varphi$, $m$, and $\lambda$. The field $\varphi_0$ is known as
the bare field, and $m_0$ and $\lambda_0$ are known as the bare mass and bare coupling, respectively.
I did not put on the subscript 0 way back in part I because I did not want to clutter up the 
notation before you, the student, even knew what a field was. 

Seen in this light, using bare perturbation theory seems like a really stupid thing to 
do, and it is. Shouldn’t we start out with a zeroth order theory already written in terms of 
the physical mass $m_P$ and physical coupling $\lambda_P$ that experimentalists actually measure,
and perturb around that theory? Yes, indeed, and this way of calculating is known as 
renormalized or dressed perturbation theory, or as I prefer to call it, physical perturbation 
theory. 

We write

$$\mathcal{L} = \frac{1}{2}[(\partial\varphi)^2 - m_P^2\varphi^2] - \frac{\lambda_P}{4!}\varphi^4 + A(\partial\varphi)^2 + B\varphi^2 + C\varphi^4 \tag{4}$$

(A word on notation: The pedantic would probably want to put a subscript P on the field
$\varphi$, but let us clutter up the notation as little as possible.) Physical perturbation theory
works as follows. The Feynman rules are as before, but with the crucial difference that
for the coupling we use $\lambda_P$ and for the propagator we write $i/(k^2 - m_P^2 + i\varepsilon)$ with the
physical mass already in place. The last three terms in (4) are known as counterterms.
The coefficients $A$, $B$, and $C$ are determined iteratively (see later) as we go to higher and
higher order in perturbation theory. They are represented as crosses in Feynman diagrams,
as indicated in figure III.3.2, with the corresponding Feynman rules. All momentum
integrals are cut off.

Figure III.3.2


Let me now explain how $A$, $B$, and $C$ are determined iteratively. Suppose we have
determined them to order $\lambda_P^N$. Call their values to this order $A_N$, $B_N$, and $C_N$. Draw all the
diagrams that appear in order $\lambda_P^{N+1}$. We determine $A_{N+1}$, $B_{N+1}$, and $C_{N+1}$ by requiring
that the propagator calculated to the order $\lambda_P^{N+1}$ has a pole at $m_P$ with a residue equal to
1, and that the meson-meson scattering amplitude evaluated at some specified values of
the kinematic variables has the value $-i\lambda_P$. In other words, the counterterms are fixed
by the condition that $m_P$ and $\lambda_P$ are what we say they are. Of course, $A$, $B$, and $C$ will
be cutoff dependent. Note that there are precisely three conditions to determine the three
unknowns $A_{N+1}$, $B_{N+1}$, and $C_{N+1}$.

Explained in this way, you can see that it is almost obvious that physical perturbation 
theory works, that is, it works in the sense that all the physical quantities that we calculate 
will be cutoff independent. Imagine, for example, that you labor long and hard to calculate 
the meson-meson scattering amplitude to order $\lambda_P^{17}$. It would contain some cutoff dependent
and some cutoff independent terms. Then you simply add a contribution given by
$C_{17}$ and adjust $C_{17}$ to cancel the cutoff dependent terms.

But ah, you start to worry. You say, “What if I calculate the amplitude for two mesons to 
go into four mesons, that is, diagrams with six external legs? If I get a cutoff dependent 
answer, then I am up the creek, as there is no counterterm of the form $D\varphi^6$ in (4) to soak
up the cutoff dependence.” Very astute of you, but this worry is covered by the following 
power counting theorem. 


Degree of divergence 

Consider a diagram with $B_E$ external $\varphi$ lines. First, a definition: A diagram is said to have
a superficial degree of divergence $D$ if it diverges as $\Lambda^D$. (A logarithmic divergence $\log\Lambda$
counts as $D = 0$.) The theorem says that $D$ is given by

$$D = 4 - B_E \tag{5}$$

Figure III.3.3


I will give a proof later, but I will first illustrate what (5) means. For the inverse propagator,
which has $B_E = 2$, we are told that $D = 2$. Indeed, we encountered a quadratic divergence.
For the meson-meson scattering amplitude, $B_E = 4$, and so $D = 0$, and indeed, it is
logarithmically divergent.

According to the theorem, if you calculate a diagram with six external legs (what is
technically sometimes known as the six-point function), $B_E = 6$ and $D = -2$. The theorem
says that your diagram is convergent or cutoff independent (i.e., the cutoff dependence
disappears as $\Lambda^{-2}$). You didn’t have to worry. You should draw a few diagrams to check
this point. Diagrams with more external legs are even more convergent.

The proof of the theorem follows from simple power counting. In addition to $B_E$ and
$D$, let us define $B_I$ as the number of internal lines, $V$ as the number of vertices, and $L$
as the number of loops. (It is helpful to focus on a specific diagram such as the one in
figure III.3.3 with $B_E = 6$, $D = -2$, $B_I = 5$, $V = 4$, and $L = 2$.)

The number of loops is just the number of $\int[d^4k/(2\pi)^4]$ we have to do. Each internal
line carries with it a momentum to be integrated over, so we seem to have $B_I$ integrals to
do. But the actual number of integrals to be done is of course decreased by the momentum
conservation delta functions associated with the vertices, one to each vertex. There are thus
$V$ delta functions, but one of them is associated with overall momentum conservation of
the entire diagram. Thus, the number of loops is

$$L = B_I - (V - 1) \tag{6}$$

[If you have trouble following this argument, you should work things out for the diagram
in figure III.3.3 for which this equation reads $2 = 5 - (4 - 1)$.]

For each vertex there are four lines coming out (or going in, depending on how you look
at it). Each external line comes out of (or goes into) one vertex. Each internal line connects
two vertices. Thus,

$$4V = B_E + 2B_I \tag{7}$$

(For figure III.3.3 this reads $4 \cdot 4 = 6 + 2 \cdot 5$.)

Finally, for each loop there is a $\int d^4k$ while for each internal line there is a
$i/(k^2 - m^2 + i\varepsilon)$, bringing the powers of momentum down by 2. Hence,

$$D = 4L - 2B_I \tag{8}$$

(For figure III.3.3 this reads $-2 = 4 \cdot 2 - 2 \cdot 5$.)

Putting (6), (7), and (8) together, we obtain the theorem (5).² As you can plainly see, this
formalizes the power counting we have been doing all along.
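The three counting relations (6), (7), and (8) can be chained together in a few lines. Here is a minimal sketch that takes the number of vertices $V$ and external lines $B_E$ of a $\varphi^4$ diagram, solves (7) for $B_I$, and then applies (6) and (8). The function name `phi4_power_counting` is a hypothetical label of my own.

```python
def phi4_power_counting(V, B_E):
    """Return (B_I, L, D) for a phi^4 diagram with V vertices and
    B_E external lines, using the counting relations (6)-(8)."""
    if 4 * V < B_E or (4 * V - B_E) % 2:
        raise ValueError("no phi^4 diagram has this V and B_E")
    B_I = (4 * V - B_E) // 2   # (7): 4V = B_E + 2 B_I
    L = B_I - (V - 1)          # (6): loops = internal lines - (V - 1)
    D = 4 * L - 2 * B_I        # (8): d^4k per loop, k^-2 per internal line
    return B_I, L, D

if __name__ == "__main__":
    # the six-point diagram of figure III.3.3
    print(phi4_power_counting(V=4, B_E=6))
    # inverse propagator and four-point function at increasing order
    for V in (2, 3, 4):
        print(V, phi4_power_counting(V, 2), phi4_power_counting(V, 4))
```

Running the loop makes the content of the theorem visible: $B_I$ and $L$ grow with $V$, but $D$ stays put at $4 - B_E$.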


Degree of divergence with fermions 

To test if you understood the reasoning leading to (5), consider the Yukawa theory we met
in (II.5.19). (We suppress the counterterms for typographical clarity.)

$$\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m_P)\psi + \frac{1}{2}[(\partial\varphi)^2 - \mu_P^2\varphi^2] - \lambda_P\varphi^4 + f_P\varphi\bar\psi\psi \tag{9}$$

Now we have to count $F_I$ and $F_E$, the number of internal and external fermion lines,
respectively, and keep track of $V_f$ and $V_\lambda$, the number of vertices with the coupling $f$ and
$\lambda$, respectively. We have five equations altogether. For instance, (7) splits into two equations
because we now have to count fermion lines as well as boson lines. For example, we now
have

$$V_f + 4V_\lambda = B_E + 2B_I \tag{10}$$

We (that is, you) finally obtain

$$D = 4 - B_E - \frac{3}{2}F_E \tag{11}$$

So the divergent amplitudes, that is, those classes of diagrams with $D \geq 0$, have $(B_E, F_E) =
(0, 2)$, $(2, 0)$, $(1, 2)$, and $(4, 0)$. We see that these correspond to precisely the six terms in
the Lagrangian (9), and thus we need six counterterms.

Note that this counting of superficial powers of divergence shows that all terms with
mass dimension $\leq 4$ are generated. For example, suppose that in writing down the Lagrangian
(9) we forgot to include the $\lambda_P\varphi^4$ term. The theory would demand that we include
this term: We have to introduce it as a counterterm [the term with $(B_E, F_E) = (4, 0)$ in the
list above].

A common feature of (5) and (11) is that they both depend only on the number of external
lines and not on the number of vertices $V$. Thus, for a given number of external lines, no
matter to what order of perturbation theory we go, the superficial degree of divergence
remains the same. Further thought reveals that we are merely formalizing the
dimension-counting argument of the preceding chapter. [Recall that the mass dimension of a Bose
field $[\varphi]$ is 1 and of a Fermi field $[\psi]$ is $\frac{3}{2}$. Hence the coefficients 1 and $\frac{3}{2}$ in (11).]
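For readers who want to check their derivation of (11), here is a minimal sketch of the bookkeeping. It assumes the counting relations are the natural extensions of (6), (7), and (8): boson lines obey (10), each Yukawa vertex carries two fermion lines, every internal line of either kind carries a momentum, and a fermion propagator lowers the power of momentum by 1 rather than 2. That reading of the "five equations," and the function name, are my assumptions rather than something spelled out in the text.

```python
from fractions import Fraction

def yukawa_power_counting(V_f, V_lam, B_E, F_E):
    """Superficial degree of divergence for a Yukawa-theory diagram with
    V_f Yukawa vertices, V_lam quartic vertices, B_E external boson lines
    and F_E external fermion lines."""
    two_B_I = V_f + 4 * V_lam - B_E   # (10): V_f + 4 V_lam = B_E + 2 B_I
    two_F_I = 2 * V_f - F_E           # two fermion lines per Yukawa vertex
    if two_B_I < 0 or two_F_I < 0 or two_B_I % 2 or two_F_I % 2:
        raise ValueError("no diagram with these vertex and line counts")
    B_I, F_I = two_B_I // 2, two_F_I // 2
    L = B_I + F_I - (V_f + V_lam - 1)  # analog of (6)
    return 4 * L - 2 * B_I - F_I       # analog of (8)

if __name__ == "__main__":
    # one-loop fermion self-energy: two Yukawa vertices, two external fermions
    print(yukawa_power_counting(V_f=2, V_lam=0, B_E=0, F_E=2))
    print(4 - 0 - Fraction(3, 2) * 2)
```

As with (5), the vertex counts drop out of the final answer, which is the point of the exercise.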

2 The superficial degree of divergence measures the divergence of the Feynman diagram as all internal
momenta are scaled uniformly by $k \to ak$ with $a$ tending to infinity. In a more rigorous treatment we have to
worry about the momenta in some subdiagram (a piece of the full diagram) going to infinity with other momenta
held fixed.

Our discussion hardly amounts to a rigorous proof that theories such as the Yukawa
theory are renormalizable. If you demand rigor, you should consult the many field theory
tomes on renormalization theory, and I do mean tomes, in which such arcane topics as
overlapping divergences and Zimmermann’s forest formula are discussed in exhaustive and
exhausting detail.


Nonrenormalizable field theories 

It is instructive to see how nonrenormalizable theories reveal their unpleasant personalities
when viewed in the context of this discussion. Consider the Fermi theory of the weak
interaction written in a simplified form:

$$\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m_P)\psi + G(\bar\psi\psi)^2.$$

The analogs of (6), (7), and (8) now read $L = F_I - (V - 1)$, $4V = F_E + 2F_I$, and $D =
4L - F_I$. Solving for the superficial degree of divergence in terms of the external number
of fermion lines, we find

$$D = 4 - \frac{3}{2}F_E + 2V \tag{12}$$

Compared to the corresponding equations for renormalizable theories (5) and (11), $D$
now depends on $V$. Thus if we calculate fermion-fermion scattering ($F_E = 4$), for example,
the divergence gets worse and worse as we go to higher and higher order in the perturbation
series. This confirms the discussion of the previous chapter. But the really bad news is
that for any $F_E$, we would start running into divergent diagrams when $V$ gets sufficiently
large, so we would have to include an unending stream of counterterms $(\bar\psi\psi)^3$, $(\bar\psi\psi)^4$,
$(\bar\psi\psi)^5$, $\cdots$, each with an arbitrary coupling constant to be determined by an experimental
measurement. The theory is severely limited in predictive power.

At one time nonrenormalizable theories were considered hopeless, but they are accepted 
in the modern view based on the effective field theory approach, which I will discuss in 
chapter VIII.3. 


Dependence on dimension 

The superficial degree of divergence clearly depends on the dimension $d$ of spacetime since
each loop is associated with $\int d^dk$. For example, consider the Fermi interaction $G(\bar\psi\psi)^2$ in
(1 + 1)-dimensional spacetime. Of the three equations that went into (12) one is changed
to $D = 2L - F_I$, giving

$$D = 2 - \frac{1}{2}F_E \tag{13}$$

the analog of (12) for 2-dimensional spacetime. In contrast to (12), $V$ no longer enters, and
the only superficially divergent diagrams have $F_E = 2$ and 4, which we can cancel by the appropriate
counterterms. The Fermi interaction is renormalizable in (1 + 1)-dimensional spacetime.
I will come back to it in chapter VII.4.
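The contrast between (12) and (13) can be seen by running the same counting with the spacetime dimension as a parameter. Here is a minimal sketch, assuming only that each loop contributes $d$ powers of momentum and each fermion propagator removes one; the function name `fermi_power_counting` is hypothetical.

```python
from fractions import Fraction

def fermi_power_counting(V, F_E, d=4):
    """Superficial degree of divergence of a four-fermion-theory diagram
    with V vertices and F_E external fermion lines in d dimensions."""
    two_F_I = 4 * V - F_E          # 4V = F_E + 2 F_I
    if two_F_I < 0 or two_F_I % 2:
        raise ValueError("no diagram with this V and F_E")
    F_I = two_F_I // 2
    L = F_I - (V - 1)              # loops
    return d * L - F_I             # d^dk per loop, 1/k per fermion line

if __name__ == "__main__":
    for V in (1, 2, 3, 4):
        print("V =", V,
              " d=4:", fermi_power_counting(V, 4, d=4),
              " d=2:", fermi_power_counting(V, 4, d=2))
```

For fermion-fermion scattering the four-dimensional column climbs by 2 with every extra vertex, while the two-dimensional column stays at 0, that is, at most logarithmic.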


The Weisskopf phenomenon 


I conclude by pointing out that the mass corrections to a Bose field and to a Fermi field
diverge differently. Since this phenomenon was first discovered by Weisskopf, I refer to
it as the Weisskopf phenomenon. To see this go back to (11) and observe that $B_E$ and $F_E$
contribute differently to the superficial degree of divergence $D$. For $B_E = 2$, $F_E = 0$, we
have $D = 2$, and thus the mass correction to a Bose field diverges quadratically, as we have
already seen explicitly [the quantity $a$ in (3)]. But for $F_E = 2$, $B_E = 0$, we have $D = 1$ and
it looks like the fermion mass is linearly divergent. Actually, in 4-dimensional field theory
we cannot possibly get a linear dependence on the cutoff. To see this, it is easiest to look at,
as an example, the Feynman integral you wrote down for exercise II.5.1, for the diagram
in figure II.5.1:

$$(if)^2 i^2\int \frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - \mu^2}\,\frac{\not p + \not k + m}{(p + k)^2 - m^2} \equiv A(p^2)\,\not p + B(p^2) \tag{14}$$

where I define the two unknown functions $A(p^2)$ and $B(p^2)$ for convenience. (For the
purpose of this discussion it doesn’t matter whether we are doing bare or physical perturbation
theory. If the latter, then I have suppressed the subscript P for the sake of notational
clarity.)

Look at the integrand for large $k$. You see that the integral goes as $\int^\Lambda d^4k\,(\not k/k^4)$ and
looks linearly divergent, but by reflection symmetry $k \to -k$ the integral to leading order
vanishes. The integral in (14) is merely logarithmically divergent. The superficial degree
of divergence $D$ often gives an exaggerated estimate of how bad the divergence can be
(hence the adjective “superficial”). In fact, staring at (14) we can prove more, for instance,
that $B(p^2)$ must be proportional to $m$. As an exercise you can show, using the Feynman
rules given in chapter II.5, that the same conclusion holds in quantum electrodynamics.

For a boson, quantum fluctuations give $\delta\mu \propto \Lambda^2/\mu$, while for a fermion such as the
electron, quantum correction to its mass $\delta m \propto m\log(\Lambda/m)$ is much more benign. It is
interesting to note that in the early twentieth century physicists thought of the electron
as a ball of charge of radius $a$. The electrostatic energy of such a ball, of the order $e^2/a$,
was identified as the electron mass. Interpreting $1/a$ as $\Lambda$, we could say that in classical
physics the electron mass is proportional to $\Lambda$ and diverges linearly. Thus, one way of
stating the Weisskopf phenomenon is that “bosons behave worse than a classical charge,
but fermions behave better.”

As Weisskopf explained in 1939, the difference in the degree of the divergence can be
understood heuristically in terms of quantum statistics. The “bad” behavior of bosons has
to do with their gregariousness. A fermion would push away the virtual fermions fluctuating
in the vacuum, thus creating a cavity in the vacuum charge distribution surrounding
it. Hence its self-energy is less singular than would be the case were quantum statistics
not taken into account. A boson does the opposite.

The “bad” behavior of bosons will come back to haunt us later. 


Power of $\hbar$ counts the number of loops

This is a convenient place to make a useful observation, though unrelated to divergences
and cutoff dependence. Suppose we restore Planck’s unit of action $\hbar$. In the path integral,
the integrand becomes $e^{iS/\hbar}$ (recall chapter I.2) so that effectively $\mathcal{L} \to \mathcal{L}/\hbar$. Consider
$\mathcal{L} = -\frac{1}{2}\varphi(\partial^2 + m^2)\varphi - \frac{\lambda}{4!}\varphi^4$, just to be definite. The coupling $\lambda \to \lambda/\hbar$, so that each vertex
is now associated with a factor of $1/\hbar$. Recall that the propagator is essentially the inverse
of the operator $(\partial^2 + m^2)$, and so in momentum space $1/(k^2 - m^2) \to \hbar/(k^2 - m^2)$. Thus
the powers of $\hbar$ are given by the number of internal lines minus the number of vertices
$P = B_I - V = L - 1$, where we used (6). You can check that this holds in general and not
just for $\varphi^4$ theory.
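The relation $P = L - 1$ is a one-line consequence of (6), but it is still pleasant to see it on a few diagrams. The following minimal sketch counts one power of $\hbar$ per internal line and one inverse power per vertex; the function names and the sample diagrams are illustrative choices of mine.

```python
def hbar_power(B_I, V):
    """Net power of hbar: +1 per propagator, -1 per vertex."""
    return B_I - V

def loops(B_I, V):
    """Number of loops from (6)."""
    return B_I - (V - 1)

if __name__ == "__main__":
    diagrams = {
        "tree-level phi^4 vertex": (0, 1),
        "one-loop bubble": (2, 2),
        "figure III.3.3": (5, 4),
    }
    for name, (B_I, V) in diagrams.items():
        print(name, "P =", hbar_power(B_I, V), "L =", loops(B_I, V))
```

Tree diagrams come with $\hbar^{-1}$, one-loop diagrams with $\hbar^0$, and so on, which is the sense in which the loop expansion is an expansion in $\hbar$.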

This observation shows that organizing Feynman diagrams by the number of loops
amounts to an expansion in Planck’s constant (sometimes called a semi-classical expansion),
with the tree diagrams providing the leading term. We will come across this again
in chapter IV.3.


Exercises 


III.3.1 Show that in (1 + 1)-dimensional spacetime the Dirac field $\psi$ has mass dimension $\frac{1}{2}$, and hence the
Fermi coupling is dimensionless.

III.3.2 Derive (11) and (13).

III.3.3 Show that $B(p^2)$ in (14) vanishes when we set $m = 0$. Show that the same behavior holds in quantum
electrodynamics.

III.3.4 We showed that the specific contribution (14) to $\delta m$ is logarithmically divergent. Convince yourself that
this is actually true to any finite order in perturbation theory.

III.3.5 Show that the result $P = L - 1$ holds for all the theories we have studied.



Gauge Invariance: A Photon Can Find No Rest 



When the central identity blows up 


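I explained in chapter I.7 that the path integral for a generic field theory can be formally
evaluated in what deserves to be called the Central Identity of Quantum Field Theory:

$$\int D\varphi\, e^{-\frac{1}{2}\varphi\cdot K\cdot\varphi - V(\varphi) + J\cdot\varphi} = e^{-V(\delta/\delta J)}\, e^{\frac{1}{2}J\cdot K^{-1}\cdot J} \tag{1}$$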

For any field theory we can always gather up all the fields, put them into one giant column
vector, and call the vector $\varphi$. We then single out the term quadratic in $\varphi$, write it as $\frac{1}{2}\varphi\cdot K\cdot\varphi$,
and call the rest $V(\varphi)$. I am using a compact notation in which spacetime coordinates and
any indices on the field, including Lorentz indices, are included in the indices of the formal
matrix $K$. We will often use (1) with $V = 0$:

$$\int D\varphi\, e^{-\frac{1}{2}\varphi\cdot K\cdot\varphi + J\cdot\varphi} = e^{\frac{1}{2}J\cdot K^{-1}\cdot J} \tag{2}$$


But what if K does not have an inverse? 

This is not an esoteric phenomenon that occurs in some pathological field theory, but 
in one of the most basic actions of physics, the Maxwell action 


$$S(A) = \int d^4x\, \mathcal{L} = \int d^4x \left[\frac{1}{2}A_\mu(\partial^2 g^{\mu\nu} - \partial^\mu\partial^\nu)A_\nu + A_\mu J^\mu\right] \tag{3}$$

The formal matrix $K$ in (2) is proportional to the differential operator $(\partial^2 g^{\mu\nu} - \partial^\mu\partial^\nu) \equiv Q^{\mu\nu}$.
A matrix does not have an inverse if some of its eigenvalues are zero, that is, if when
acting on some vector, the matrix annihilates that vector. Well, observe that $Q^{\mu\nu}$ annihilates
vectors of the form $\partial_\nu\Lambda(x)$: $Q^{\mu\nu}\partial_\nu\Lambda(x) = 0$. Thus $Q^{\mu\nu}$ has no inverse.
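
The zero mode is easy to exhibit in momentum space. The following is a minimal sketch, assuming the metric signature (+, -, -, -) and the replacement $\partial^2 \to -k^2$, $\partial^\mu\partial^\nu \to -k^\mu k^\nu$, so that $Q^{\mu\nu} \to -k^2 g^{\mu\nu} + k^\mu k^\nu$; the helper names and the test momentum are arbitrary choices made for this example.

```python
from fractions import Fraction

# Metric g = diag(+1, -1, -1, -1); for this diagonal metric the components
# of g^{mu nu} and g_{mu nu} coincide numerically.
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def lower(k_up):
    """Lower the index: k_mu = g_{mu nu} k^nu."""
    return [G[m][m] * k_up[m] for m in range(4)]


def q_matrix(k_up):
    """Momentum-space Q^{mu nu} = -k^2 g^{mu nu} + k^mu k^nu."""
    k2 = sum(a * b for a, b in zip(k_up, lower(k_up)))
    return [[-k2 * G[m][n] + k_up[m] * k_up[n] for n in range(4)]
            for m in range(4)]


def act(q, v_down):
    """Contract Q^{mu nu} with a lower-index vector v_nu."""
    return [sum(q[m][n] * v_down[n] for n in range(4)) for m in range(4)]


k = [Fraction(3), Fraction(1), Fraction(-2), Fraction(5)]  # arbitrary test momentum
zero_mode = act(q_matrix(k), lower(k))
print(zero_mode)  # every component vanishes
```

Since $k_\nu$ is the momentum-space image of $\partial_\nu\Lambda$, the vanishing of every component is the statement $Q^{\mu\nu}\partial_\nu\Lambda = 0$.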

There is absolutely nothing mysterious about this phenomenon; we have already encountered
it in classical physics. Indeed, when we first learned the concepts of electricity,
we were told that only the “voltage drop” between two points has physical meaning. At
a more sophisticated level, we learned that we can always add any constant (or indeed
any function of time) to the electrostatic potential (which is of course just “voltage”) since
by definition its gradient is the electric field. At an even more sophisticated level, we see
that solving Maxwell’s equation (which of course comes from just extremizing the action)
amounts to finding the inverse $Q^{-1}$. [In the notation I am using here Maxwell’s equation
$\partial_\mu F^{\mu\nu} = J^\nu$ is written as $Q^{\mu\nu}A_\nu = J^\mu$, and the solution is $A_\nu = (Q^{-1})_{\nu\mu}J^\mu$.]

Well, $Q^{-1}$ does not exist! What do we do? We learned that we must impose an additional
constraint on the gauge potential $A_\mu$, known as “fixing a gauge.”


A mundane nonmystery 

To emphasize the rather mundane nature of this gauge fixing problem (which some
older texts tend to make into something rather mysterious and almost hopelessly difficult
to understand), consider just an ordinary integral $\int dA\, e^{-A\cdot K\cdot A}$, with
$A = (a, b)$ a 2-component vector and $K = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, a matrix without an inverse. Of course
you realize what the problem is: We have $\int\!\!\int da\, db\, e^{-a^2}$ and the integral over
$b$ does not exist. To define the integral we insert into it a delta function $\delta(b - \xi)$.
The integral becomes defined and actually does not depend on the arbitrary number
$\xi$. More generally, we can insert $\delta[f(b)]$ with $f$ some function of our choice. In
the context of an ordinary integral, this procedure is of course ludicrous overkill, but
we will use the analog of this procedure in what follows. In this baby problem, we
could regard the variable $b$, and thus the integral over it, as “redundant.” As we will
see, gauge invariance is also a redundancy in our description of massless spin 1 particles.

A massless spin 1 field is intrinsically different from a massive spin 1 field—that’s the 
crux of the problem. The photon has only two polarization degrees of freedom. (You already 
learned in classical electrodynamics that an electromagnetic wave has two transverse 
degrees of freedom.) This is the true physical origin of gauge invariance. 

In this sense, gauge invariance is, strictly speaking, not a “real” symmetry but merely a 
reflection of the fact that we used a redundant description: a Lorentz vector field to describe 
two physical degrees of freedom. 


Restricting the functional integral 

I will now discuss the method for dealing with this redundancy invented by Faddeev and 
Popov. As you will see presently, it is the analog of the method we used in our baby problem 
above. Even in the context of electromagnetism this method is a bit of overkill, but it will 
prove to be essential for nonabelian gauge theories (as we will see in chapter VII.1) and 
for gravity. I will describe the method using a completely general and somewhat abstract 
language. In the next section, I will then apply the discussion here to a specific example. 
If you have some trouble with this section, you might find it helpful to go back and forth
between the two sections.

Suppose we have to do the integral $I \equiv \int DA\, e^{iS(A)}$; this can be an ordinary integral
or a path integral. Suppose that under the transformation $A \to A_g$ the integrand and
the measure do not change, that is, $S(A) = S(A_g)$ and $DA = DA_g$. The transformations
obviously form a group, since if we transform again with $g'$, the integrand and the measure
do not change under the combined effect of $g$ and $g'$ and $A_g \to (A_g)_{g'} = A_{gg'}$. We would
like to write the integral $I$ in the form $I = (\int Dg)J$, with $J$ independent of $g$. In other
words, we want to factor out the redundant integration over $g$. Note that $Dg$ is the invariant
measure over the group of transformations and $\int Dg$ is the volume of the group. Be aware
of the compactness of the notation in the case of a path integral: $A$ and $g$ are both functions
of the spacetime coordinates $x$.

I want to emphasize that this hardly represents anything profound or mysterious. If
you have to do the integral $I = \int dx\, dy\, e^{iS(x,y)}$ with $S(x, y)$ some function of $x^2 + y^2$, you
know perfectly well to go to polar coordinates $I = (\int d\theta)J = (2\pi)J$, where $J = \int dr\, r\, e^{iS(r)}$
is an integral over the radial coordinate $r$ only. The factor $2\pi$ is precisely the volume of the
group of rotations in 2 dimensions.
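
The factorization $I = (2\pi)J$ can be checked numerically. This sketch swaps in a convergent Euclidean weight $e^{-(x^2+y^2)}$ in place of the oscillatory $e^{iS}$ (an assumption made purely so that a simple midpoint rule converges); the function names, grid sizes, and cutoff are illustrative choices.

```python
import math

# Compare the Cartesian double integral of exp(-(x^2 + y^2)) with
# 2*pi times the radial integral of r * exp(-r^2).


def cartesian_integral(n=400, cutoff=6.0):
    """Midpoint rule over the square [-cutoff, cutoff]^2."""
    h = 2 * cutoff / n
    pts = [-cutoff + (i + 0.5) * h for i in range(n)]
    row = sum(math.exp(-x * x) for x in pts) * h
    return row * row  # the integrand factorizes into x and y pieces


def radial_integral(n=4000, cutoff=6.0):
    """Midpoint rule for J = integral of r * exp(-r^2) over [0, cutoff]."""
    h = cutoff / n
    return sum((i + 0.5) * h * math.exp(-(((i + 0.5) * h) ** 2))
               for i in range(n)) * h


I = cartesian_integral()
J = radial_integral()
print(I, 2 * math.pi * J)  # the two numbers agree to within the grid error
```

The redundant angular integration contributes only the overall factor $2\pi$, the volume of the rotation group, while all the dependence on the weight sits in $J$.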

Faddeev and Popov showed how to do this “going over to polar coordinates” in a
unified and elegant way. Following them, we first write the numeral “one” as
$1 = \Delta(A)\int Dg\, \delta[f(A_g)]$, an equality that merely defines $\Delta(A)$. Here $f$ is some function
of our choice and $\Delta(A)$, known as the Faddeev-Popov determinant, of course depends on
$f$. Next, note that $[\Delta(A_{g'})]^{-1} = \int Dg\, \delta[f(A_{g'g})] = \int Dg''\, \delta[f(A_{g''})] = [\Delta(A)]^{-1}$, where the
second equality follows upon defining $g'' = g'g$ and noting that $Dg'' = Dg$. In other words,
we showed that $\Delta(A) = \Delta(A_g)$: the Faddeev-Popov determinant is gauge invariant. We
now insert 1 into the integral $I$ we have to do:

$$\begin{aligned}
I &= \int DA\, e^{iS(A)} \\
&= \int DA\, e^{iS(A)}\, \Delta(A) \int Dg\, \delta[f(A_g)] \\
&= \int Dg \int DA\, e^{iS(A)}\, \Delta(A)\, \delta[f(A_g)]
\end{aligned} \tag{4}$$

As physicists and not mathematicians, we have merrily interchanged the order of 
integration. 

At the physicist’s level of rigor, we are always allowed to change integration variables
until proven guilty. So let us change $A$ to $A_{g^{-1}}$; then

$$I = \left(\int Dg\right) \int DA\, e^{iS(A)}\, \Delta(A)\, \delta[f(A)] \tag{5}$$

where we have used the fact that $DA$, $S(A)$, and $\Delta(A)$ are all invariant under $A \to A_{g^{-1}}$.

That’s it. We’ve done it. The group integration $(\int Dg)$ has been factored out.

The volume of a compact group is finite, but in gauge theories there is a separate group
at every point in spacetime, and hence $(\int Dg)$ is an infinite factor. (This also explains
why there is no gauge fixing problem in theories with global symmetries introduced in
chapter I.10.) Fortunately, in the path integral $Z$ for field theory we do not care about
overall factors in $Z$, as was explained in chapter I.3, and thus the factor $(\int Dg)$ can simply
be thrown away.


Fixing the electromagnetic gauge 


Let us now apply the Faddeev-Popov method to electromagnetism. The transformation
leaving the action invariant is of course $A_\mu \to A_\mu - \partial_\mu\Lambda$, so $g$ in the present context is
denoted by $\Lambda$ and $A_g \equiv A_\mu - \partial_\mu\Lambda$. Note also that since the integral $I$ we started with is
independent of $f$ it is still independent of $f$ in spite of its appearance in (5). Choose
$f(A) = \partial A - \sigma$, where $\sigma$ is a function of $x$. In particular, $I$ is independent of $\sigma$ and
so we can integrate $I$ with an arbitrary functional of $\sigma$, in particular, the functional
$e^{-(i/2\xi)\int d^4x\, \sigma(x)^2}$.

We now turn the crank. First, we calculate

$$[\Delta(A)]^{-1} \equiv \int Dg\, \delta[f(A_g)] = \int D\Lambda\, \delta(\partial A - \partial^2\Lambda - \sigma) \tag{6}$$

Next we note that in (5) $\Delta(A)$ appears multiplied by $\delta[f(A)]$ and so in evaluating $[\Delta(A)]^{-1}$
in (6) we can effectively set $f(A) = \partial A - \sigma$ to zero. Thus from (6) we have $\Delta(A)$ “=”
$[\int D\Lambda\, \delta(\partial^2\Lambda)]^{-1}$. But this object does not even depend on $A$, so we can throw it away. Thus,
up to irrelevant overall factors that could be thrown away $I$ is just $\int DA\, e^{iS(A)}\delta(\partial A - \sigma)$.
Integrating over $\sigma(x)$ as we said we were going to do, we finally obtain

$$\begin{aligned}
Z &= \int D\sigma\, e^{-(i/2\xi)\int d^4x\, \sigma(x)^2} \int DA\, e^{iS(A)}\, \delta(\partial A - \sigma) \\
&= \int DA\, e^{iS(A) - (i/2\xi)\int d^4x\, (\partial A)^2}
\end{aligned} \tag{7}$$

Nifty trick by Faddeev and Popov, eh?

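Thus, $S(A)$ in (3) is effectively replaced by

$$S_{\text{eff}}(A) = S(A) - \frac{1}{2\xi}\int d^4x\, (\partial A)^2 = \int d^4x \left\{\frac{1}{2}A_\mu\left[\partial^2 g^{\mu\nu} - \left(1 - \frac{1}{\xi}\right)\partial^\mu\partial^\nu\right]A_\nu + A_\mu J^\mu\right\} \tag{8}$$

and $Q^{\mu\nu}$ by $Q^{\mu\nu}_{\text{eff}} = \partial^2 g^{\mu\nu} - (1 - 1/\xi)\partial^\mu\partial^\nu$ or in momentum space $Q^{\mu\nu}_{\text{eff}} = -k^2 g^{\mu\nu} + (1 - 1/\xi)k^\mu k^\nu$,
which does have an inverse. Indeed, you can check that

$$Q^{\mu\nu}_{\text{eff}}\left[-g_{\nu\lambda} + (1 - \xi)\frac{k_\nu k_\lambda}{k^2}\right]\frac{1}{k^2} = \delta^\mu_\lambda$$

Thus, the photon propagator can be chosen to be

$$\frac{(-i)}{k^2}\left[g_{\nu\lambda} - (1 - \xi)\frac{k_\nu k_\lambda}{k^2}\right] \tag{9}$$

in agreement with the conclusion in chapter II.7.

While the Faddeev-Popov argument is a lot slicker, many physicists still prefer the explicit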
Feynman argument given in chapter II.7. I do. When we deal with the Yang-Mills theory 
and the Einstein theory, however, the Faddeev-Popov method is indispensable, as I have 
already noted. 
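
The inverse quoted just before (9) can be checked with exact rational arithmetic. The sketch below assumes the metric signature (+, -, -, -) and an off-shell test momentum with $k^2 \neq 0$; the function name `gauge_fixed_product` and the sample values of $\xi$ are choices made for this illustration.

```python
from fractions import Fraction

G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
IDENTITY = [[1 if m == n else 0 for n in range(4)] for m in range(4)]


def gauge_fixed_product(k_up, xi):
    """Return Q_eff^{mu nu} D_{nu lam} for integer k^mu and gauge parameter xi."""
    xi = Fraction(xi)
    k_dn = [G[m][m] * k_up[m] for m in range(4)]
    k2 = sum(a * b for a, b in zip(k_up, k_dn))  # must be nonzero
    # Q_eff^{mu nu} = -k^2 g^{mu nu} + (1 - 1/xi) k^mu k^nu
    q = [[-k2 * G[m][n] + (1 - 1 / xi) * k_up[m] * k_up[n]
          for n in range(4)] for m in range(4)]
    # D_{nu lam} = [-g_{nu lam} + (1 - xi) k_nu k_lam / k^2] / k^2
    d = [[(-G[n][l] + (1 - xi) * k_dn[n] * k_dn[l] / k2) / k2
          for l in range(4)] for n in range(4)]
    return [[sum(q[m][n] * d[n][l] for n in range(4)) for l in range(4)]
            for m in range(4)]


k = [3, 1, -2, 5]  # arbitrary off-shell momentum, k^2 = -21
for xi in (Fraction(1), Fraction(1, 3), Fraction(7)):
    print(xi, gauge_fixed_product(k, xi) == IDENTITY)
```

The product is the identity for every finite nonzero $\xi$ tried; only in the limit $\xi \to \infty$, where the gauge-fixing term disappears, does the zero mode along $k_\nu$ return.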


A photon can find no rest 

Let us understand the physics behind the necessity for imposing by hand a (gauge fixing)
constraint in gauge theories. In chapter I.5 we sidestepped this whole issue of fixing the
gauge by treating the massive vector meson instead of the photon. In effect, we changed
$Q^{\mu\nu}$ to $(\partial^2 + m^2)g^{\mu\nu} - \partial^\mu\partial^\nu$, which does have an inverse (in fact we even found the inverse
explicitly). We then showed that we could set the mass $m$ to 0 in physical calculations.

There is, however, a huge and intrinsic difference between massive and massless
particles. Consider a massive particle moving along. We can always boost it to its rest
frame, or in more mathematical terms, we can always Lorentz transform the momentum
of a massive particle to the reference momentum $q^\mu = m(1, 0, 0, 0)$. (As is the case
elsewhere in this book, if there is no risk of confusion, we write column vectors as row
vectors for typographical convenience.) To study the spin degrees of freedom, we should
evidently sit in the rest frame of the particle and study how its states respond to rotation.
The fancy pants way of saying this is we should study how the states of the particle
transform under that particular subgroup of the Lorentz group (known as the little group)
consisting of those Lorentz transformations $\Lambda$ that leave $q^\mu$ invariant, namely $\Lambda^\mu{}_\nu q^\nu = q^\mu$.
For $q^\mu = m(1, 0, 0, 0)$, the little group is obviously the rotation group $SO(3)$. We then apply
what we learned in nonrelativistic quantum mechanics and conclude that a spin $j$ particle
has $(2j + 1)$ spin states (or polarizations in classical physics), as already noted back in
chapter I.5.

But if the particle is massless, we can no longer find a Lorentz boost that would bring 
us to its rest frame. A photon can find no rest! 

For a massless particle, the best we can do is to transform the particle’s momentum to the
reference momentum $q^\mu = \omega(1, 0, 0, 1)$ for some arbitrarily chosen $\omega$. Again, this is just
a fancy way of saying that we can always call the direction of motion the third axis. What is
the little group that leaves $q^\mu$ invariant? Obviously, rotations around the third axis, forming
the group $O(2)$, leave $q^\mu$ invariant. The spin states of a massless particle of any spin around
its direction of motion are known as helicity states, as was already mentioned in chapter
II.1. For a particle of spin $j$, the helicities $\pm j$ are transformed into each other by parity
and time reversal, and thus both helicities must be present if the interactions the particle
participates in respect these discrete symmetries, as is the case with the photon and the
graviton.¹ In particular, the photon, as we have seen repeatedly, has only two polarization
degrees of freedom, instead of three, since we no longer have the full rotation group $SO(3)$.
(You already learned in classical electrodynamics that an electromagnetic wave has two
transverse degrees of freedom.) For more on this, see appendix B.

¹ But not with the neutrino.

In this sense, gauge invariance is strictly speaking not a “real” symmetry but merely a
reflection of the fact that we used a redundant description: we used a vector field $A_\mu$ with
its four degrees of freedom to describe two physical degrees of freedom. This is the true
physical origin of gauge invariance.

The condition $\Lambda^\mu{}_\nu q^\nu = q^\mu$ should leave us with a 3-parameter subgroup. To find the other
transformations, it suffices to look in the neighborhood of the identity, that is, at Lorentz
transformations of the form $\Lambda(\alpha, \beta) = I + \alpha A + \beta B + \cdots$. By inspection, we see that

$$A = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = i(K_1 + J_2), \qquad B = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = i(K_2 - J_1) \tag{10}$$


where we used the notation for the generators of the Lorentz group from chapter II.3. Note 
that A and B are to a large extent determined by the fact that J and K are symmetric and 
antisymmetric, respectively. 

By direct computation or by invoking the celebrated minus sign in (II.3.9), we find
that $[A, B] = 0$. Also, $[J_3, A] = B$ and $[J_3, B] = -A$ so that, as expected, $(A, B)$ form a
2-component vector under $O(2)$ rotations around the third axis. (For those who must
know, the generators $A$, $B$, and $J_3$ generate the group $ISO(2)$, the invariance group of
the Euclidean 2-plane, consisting of two translations and one rotation.)
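
These algebraic statements can be verified by brute-force matrix multiplication. In the sketch below the matrices $A$ and $B$ are the ones read off from (10), acting on components ordered as (0, 1, 2, 3) with $\omega = 1$. The matrix `R3` is an assumed real form of the rotation generator about the third axis, with its overall sign and factor of $i$ chosen so that the commutators come out as stated; it is a convention adopted for this example, not taken from the text.

```python
A = [[0, 1, 0, 0], [1, 0, 0, -1], [0, 0, 0, 0], [0, 1, 0, 0]]
B = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, -1], [0, 0, 1, 0]]
# Real generator of rotations in the 1-2 plane (assumed sign convention).
R3 = [[0, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
Q_REF = [1, 0, 0, 1]  # reference momentum q = omega * (1, 0, 0, 1), omega = 1
ZERO = [[0] * 4 for _ in range(4)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def commutator(x, y):
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(4)] for i in range(4)]


def apply(x, v):
    return [sum(x[i][j] * v[j] for j in range(4)) for i in range(4)]


def negate(x):
    return [[-e for e in row] for row in x]


print(apply(A, Q_REF), apply(B, Q_REF))   # both annihilate q
print(commutator(A, B) == ZERO)           # the two "translations" commute
print(commutator(R3, A) == B, commutator(R3, B) == negate(A))
```

The pair $(A, B)$ rotates into itself under `R3` while commuting with each other, which is the structure of two translations and one rotation.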

The preceding paragraph establishing the little group for massless particles applies
for any spin, including zero. Now specialize to a spin 1 massless particle with the two
polarization vectors $\epsilon_\pm(q) = (1/\sqrt{2})(0, 1, \pm i, 0)$. The polarization vectors are defined by
how they transform under rotation. So it is natural to ask how $\epsilon_\pm(q)$ transform under
$\Lambda(\alpha, \beta)$. Inspecting (10), we see that

$$\epsilon_\pm(q) \to \epsilon_\pm(q) + \frac{1}{\sqrt{2}}(\alpha \pm i\beta)\, q \tag{11}$$


We recognize (11) as a gauge transformation (as was explained in chapter II.7). For a massless
spin 1 particle, the gauge transformation is contained in the Lorentz transformations!
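
The shift in (11) can be reproduced directly from the matrices in (10). This is a small sketch under the assumptions that the reference momentum is normalized to $\omega = 1$ and that the circular polarization vectors are $(1/\sqrt{2})(0, 1, \pm i, 0)$; the function names and the sample values of $\alpha$ and $\beta$ are arbitrary.

```python
import math

A = [[0, 1, 0, 0], [1, 0, 0, -1], [0, 0, 0, 0], [0, 1, 0, 0]]
B = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, -1], [0, 0, 1, 0]]
Q_REF = [1, 0, 0, 1]          # q with omega = 1 (assumption)
S = 1 / math.sqrt(2)


def polarization(sign):
    """Circular polarization vector (1/sqrt 2)(0, 1, sign*i, 0)."""
    return [0, S, sign * 1j * S, 0]


def shift(eps, alpha, beta):
    """First-order change of eps under I + alpha*A + beta*B."""
    return [sum((alpha * A[m][n] + beta * B[m][n]) * eps[n] for n in range(4))
            for m in range(4)]


alpha, beta = 0.3, -0.7
for sign in (+1, -1):
    delta = shift(polarization(sign), alpha, beta)
    expected = [(alpha + sign * 1j * beta) * S * q for q in Q_REF]
    print(sign, all(abs(a - b) < 1e-12 for a, b in zip(delta, expected)))
```

The change in each polarization vector is proportional to $q$ itself, which is exactly the form of a gauge transformation in momentum space.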

Suppose we construct the corresponding spin 1 field as in chapter II.5 [and in analogy
to (I.8.11) and (II.2.10)]:

$$A_\mu(x) = \int \frac{d^3k}{\sqrt{(2\pi)^3 2\omega_k}} \sum_{\alpha=1,2} \left[a^{(\alpha)}(k)\, \epsilon^{(\alpha)}_\mu(k)\, e^{-i(\omega_k t - \vec{k}\cdot\vec{x})} + a^{(\alpha)\dagger}(k)\, \epsilon^{(\alpha)*}_\mu(k)\, e^{i(\omega_k t - \vec{k}\cdot\vec{x})}\right] \tag{12}$$

with $\omega_k = |\vec{k}|$. The polarization vectors $\epsilon^{(\alpha)}_\mu(k)$ are of course determined by the condition
$k^\mu \epsilon^{(\alpha)}_\mu(k) = 0$, which we could easily satisfy by defining $\epsilon^{(\alpha)}(k) = \Lambda(q \to k)\epsilon^{(\alpha)}(q)$, where
$\Lambda(q \to k)$ denotes a Lorentz transformation that brings the reference momentum $q$ to $k$.

Note that the $\epsilon^{(\alpha)}(k)$ thus constructed has a vanishing time component. (To see this,
first boost $\epsilon^{(\alpha)}(q)$ along the third axis and then rotate, for example.) Hence,
$k^\mu \epsilon^{(\alpha)}_\mu(k) = -\vec{k}\cdot\vec{\epsilon}^{(\alpha)}(k) = 0$. These properties of $\epsilon^{(\alpha)}(k)$ translate into $A_0(x) = 0$ and $\vec{\nabla}\cdot\vec{A}(x) = 0$.
These two constraints cut the four degrees of freedom contained in $A_\mu(x)$ down to two
and fix what is known as the Coulomb or radiation gauge.²

Given the enormous importance of gauge invariance, it might be instructive to review
the logic underlying the “poor man’s approach” to gauge invariance (which, as I mentioned
in chapter I.5, I learned from Coleman) adopted in this book for pedagogical reasons. You
could have fun faking Feynman’s mannerism and accent, saying, “Aw shucks, all that fancy 
talk about little groups! Who needs it? Those experimentalists won’t ever be able to prove 
that the photon mass is mathematically zero anyway.” 

So start, as in chapter I.5, with the two equations needed for describing a spin 1 massive
particle:

$$(\partial^2 + m^2)A_\mu = 0 \tag{13}$$

and

$$\partial_\mu A^\mu = 0 \tag{14}$$

Equation (14) is needed to cut the number of degrees of freedom contained in $A_\mu$ down
from four to three.

Lo and behold, (13) and (14) are equivalent to the single equation

$$\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) + m^2 A_\nu = 0 \tag{15}$$

Obviously, (13) and (14) together imply (15). To verify that (15) implies (13) and (14), we
act with $\partial^\nu$ on (15) and obtain

$$m^2\, \partial A = 0 \tag{16}$$

which for $m \neq 0$ requires $\partial A = 0$, namely (14). Plugging this into (15) we obtain (13).
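Having packaged two equations into one, we note that we can derive this single equation
(15) by varying the Lagrangian

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2A^2 \tag{17}$$

with $F_{\mu\nu} \equiv \partial_\mu A_\nu - \partial_\nu A_\mu$.

Next, suppose we include a source $J^\mu$ for this particle by changing the Lagrangian to

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}m^2A^2 + A_\mu J^\mu \tag{18}$$

with the resulting equation of motion

$$\partial^\mu(\partial_\mu A_\nu - \partial_\nu A_\mu) + m^2 A_\nu = -J_\nu \tag{19}$$

But now observe that when we act with $\partial^\nu$ on (19) we obtain

$$m^2\, \partial A = -\partial J \tag{20}$$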


We recover (14) only if $\partial_\mu J^\mu = 0$, that is, if the source producing the particle, commonly
known as the current, is conserved.

² For a much more detailed and leisurely discussion, see S. Weinberg, Quantum Theory of Fields, pp. 69-74
and 246-255.

Put more vividly, suppose the experimentalists who constructed the accelerator (or
whatever) to produce the spin 1 particle messed up and failed to ensure that $\partial_\mu J^\mu = 0$;
then $\partial A \neq 0$ and a spin 0 excitation would also be produced. To make sure that the beam
of spin 1 particles is not contaminated with spin 0 particles, the accelerator builders must
assure us that the source $J^\mu$ in the Lagrangian (18) is indeed conserved.
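
The step from (15) to (16), and from (19) to (20), amounts to one line of algebra in momentum space. The sketch below assumes a plane wave $A_\mu \propto e^{-ikx}$, so that $\partial_\mu \to -ik_\mu$ and the left-hand side of (15) becomes $(-k^2 + m^2)A_\nu + k_\nu(k\cdot A)$; overall factors of $i$ are dropped, and the function names and test vectors are invented for the illustration.

```python
G_DIAG = (1, -1, -1, -1)  # metric signature (+, -, -, -)


def dot(a_up, b_up):
    """Minkowski product a.b of two upper-index vectors."""
    return sum(g * x * y for g, x, y in zip(G_DIAG, a_up, b_up))


def proca_lhs(k_up, a_up, m2):
    """Lower-index components of (-k^2 + m^2) A_nu + k_nu (k.A)."""
    k2, ka = dot(k_up, k_up), dot(k_up, a_up)
    return [G_DIAG[n] * ((-k2 + m2) * a_up[n] + k_up[n] * ka)
            for n in range(4)]


def contract_with_k(k_up, v_down):
    """k^nu v_nu, the momentum-space image of acting with the divergence."""
    return sum(k * v for k, v in zip(k_up, v_down))


k = [3, 1, -2, 5]     # arbitrary momentum
a = [2, -1, 4, 7]     # arbitrary polarization, not assumed transverse
m2 = 11               # arbitrary mass squared
print(contract_with_k(k, proca_lhs(k, a, m2)), m2 * dot(k, a))
```

The two printed numbers coincide for any $k$ and $A$: the $k^2$ terms cancel identically and only $m^2\, k\cdot A$ survives. With a source on the right-hand side this reads $m^2\, k\cdot A = -k\cdot J$ up to the dropped factors, so $k\cdot A = 0$ follows only if $k\cdot J = 0$.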

Now, if we want to study massless spin 1 particles, we simply set $m = 0$ in (18). The
“poor man” ends up (just like the “rich man”) using the Lagrangian

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + A_\mu J^\mu \tag{21}$$

to describe the photon. Lo and behold (as we exclaimed in chapter II.7), $\mathcal{L}$ is left invariant
by the gauge transformation $A_\mu \to A_\mu - \partial_\mu\Lambda$ for any $\Lambda(x)$. (As was also explained in that
chapter, the third polarization decouples in the limit $m \to 0$.) The “poor man” has thus
discovered gauge invariance!

However, as I warned in chapter I.5, depending on his or her personality, the poor man
could also wake up in the middle of the night worrying that physics might be discontinuous
in the limit $m \to 0$. Thus the little group discussion is needed to remove that nightmare.
But then a “real” physicist in the Feynman mode could always counter that for any physical
measurement everything must be okay as long as the duration of the experiment is short
compared to the characteristic time $1/m$. More on this issue in chapter VIII.1.


A reflection on gauge symmetry 

As we will see later and as you might have heard, much of the world beyond electromagnetism
is also described by gauge theories. But as we saw here, gauge theories are
also deeply disturbing and unsatisfying in some sense: They are built on a redundancy
of description. The electromagnetic gauge transformation $A_\mu \to A_\mu - \partial_\mu\Lambda$ is not truly a
symmetry stating that two physical states have the same properties. Rather, it tells us that
the two gauge potentials $A_\mu$ and $A_\mu - \partial_\mu\Lambda$ describe the same physical state. In your orderly
study of physics, the first place where $A_\mu$ becomes indispensable is the Schrödinger
equation, as I will explain in chapter IV.4. Within classical physics, you got along perfectly
well with just $\vec{E}$ and $\vec{B}$. Some physicists are looking for a formulation of quantum electrodynamics
without using $A_\mu$, but so far have failed to turn up an attractive alternative
to what we have. It is conceivable that a truly deep advance in theoretical physics would
involve writing down quantum electrodynamics without writing $A_\mu$.




Field Theory without Relativity 


Slower in its maturity 


Quantum field theory at its birth was relativistic. Later in its maturity, it found applications 
in condensed matter physics. We will have a lot more to say about the role of quantum field 
theory in condensed matter, but for now, we have the more modest goal of learning how 
to take the nonrelativistic limit of a quantum field theory. 

The Lorentz invariant scalar field theory

$$\mathcal{L} = (\partial\Phi^\dagger)(\partial\Phi) - m^2\Phi^\dagger\Phi - \lambda(\Phi^\dagger\Phi)^2 \tag{1}$$

(with $\lambda > 0$ as always) describes a bunch of interacting bosons. It should certainly contain
the physics of slowly moving bosons. For clarity consider first the relativistic Klein-Gordon
equation

$$(\partial^2 + m^2)\Phi = 0 \tag{2}$$

for a free scalar field. A mode with energy $E = m + \varepsilon$ would oscillate in time as $\Phi \propto e^{-iEt}$.
In the nonrelativistic limit, the kinetic energy $\varepsilon$ is much smaller than the rest mass
$m$. It makes sense to write $\Phi(\vec{x}, t) = e^{-imt}\varphi(\vec{x}, t)$, with the field $\varphi$ oscillating in time
much more slowly than $e^{-imt}$. Plugging into (2) and using the identity
$(\partial/\partial t)e^{-imt}(\cdots) = e^{-imt}(-im + \partial/\partial t)(\cdots)$ twice, we obtain $(-im + \partial/\partial t)^2\varphi - \vec{\nabla}^2\varphi + m^2\varphi = 0$. Dropping the
term $(\partial^2/\partial t^2)\varphi$ as small compared to $-2im(\partial/\partial t)\varphi$, we find Schrödinger’s equation, as we
had better:

$$i\frac{\partial}{\partial t}\varphi = -\frac{\vec{\nabla}^2}{2m}\varphi \tag{3}$$
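
How good is the approximation of dropping $(\partial^2/\partial t^2)\varphi$? For a plane wave $\varphi \propto e^{-i\varepsilon t + i\vec{k}\cdot\vec{x}}$ the equation before the approximation gives $2m\varepsilon + \varepsilon^2 = k^2$, so the Schrödinger value $k^2/2m$ overshoots $\varepsilon$ by the relative amount $\varepsilon/2m$. The sketch below checks this numerically in units with $m = 1$; the function names are invented for the example.

```python
import math


def kinetic_energy(k, m):
    """epsilon = sqrt(k^2 + m^2) - m, rewritten to avoid cancellation."""
    return k * k / (math.sqrt(k * k + m * m) + m)


def schrodinger_energy(k, m):
    """Nonrelativistic kinetic energy k^2 / 2m from equation (3)."""
    return k * k / (2 * m)


def relative_error(k, m):
    """Fractional error made by dropping the second time derivative."""
    eps = kinetic_energy(k, m)
    return (schrodinger_energy(k, m) - eps) / eps


for k in (0.5, 0.1, 0.01):
    # The second and third columns agree: the error is epsilon / 2m.
    print(k, relative_error(k, 1.0), kinetic_energy(k, 1.0) / 2.0)
```

The error shrinks in proportion to $\varepsilon/m$, which is the precise sense in which the dropped term is small.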


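By the way, the Klein-Gordon equation was actually discovered before Schrödinger’s
equation.

Having absorbed this, you can now easily take the nonrelativistic limit of a quantum
field theory. Simply plug

$$\Phi(\vec{x}, t) = \frac{1}{\sqrt{2m}}e^{-imt}\varphi(\vec{x}, t) \tag{4}$$

into (1). (The factor $1/\sqrt{2m}$ is for later convenience.) For example,

$$\frac{\partial\Phi^\dagger}{\partial t}\frac{\partial\Phi}{\partial t} - m^2\Phi^\dagger\Phi \to \frac{1}{2m}\left\{\left[\left(im + \frac{\partial}{\partial t}\right)\varphi^\dagger\right]\left[\left(-im + \frac{\partial}{\partial t}\right)\varphi\right] - m^2\varphi^\dagger\varphi\right\} \simeq \frac{1}{2}i\left(\varphi^\dagger\frac{\partial\varphi}{\partial t} - \frac{\partial\varphi^\dagger}{\partial t}\varphi\right) \tag{5}$$

After an integration by parts we arrive at

$$\mathcal{L} = i\varphi^\dagger\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi - g^2(\varphi^\dagger\varphi)^2 \tag{6}$$

where $g^2 = \lambda/4m^2$.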

As we saw in chapter I.10 the theory (1) enjoys a conserved Noether current
$J_\mu = i(\Phi^\dagger\partial_\mu\Phi - \partial_\mu\Phi^\dagger\Phi)$. The density $J_0$ reduces to $\varphi^\dagger\varphi$, precisely as you would expect, while
$J_i$ reduces to $(i/2m)(\varphi^\dagger\partial_i\varphi - \partial_i\varphi^\dagger\varphi)$. When you first took a course in quantum mechanics,
didn’t you wonder why the density $\rho \equiv \varphi^\dagger\varphi$ and the current $J_i = (i/2m)(\varphi^\dagger\partial_i\varphi - \partial_i\varphi^\dagger\varphi)$
look so different? As to be expected, various expressions inevitably become uglier when
reduced from a more symmetric to a less symmetric theory.


Number is conjugate to phase angle 


Let me point out some differences between the relativistic and nonrelativistic case. 

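The most striking is that the relativistic theory is quadratic in time derivative, while
the nonrelativistic theory is linear in time derivative. Thus, in the nonrelativistic theory
the momentum density conjugate to the field $\varphi$, namely $\delta\mathcal{L}/\delta\partial_0\varphi$, is just $i\varphi^\dagger$, so that
$[\varphi^\dagger(\vec{x}, t), \varphi(\vec{x}', t)] = -\delta^{(D)}(\vec{x} - \vec{x}')$. In condensed matter physics it is often illuminating to
write $\varphi = \sqrt{\rho}\, e^{i\theta}$ so that

$$\mathcal{L} = \frac{i}{2}\partial_0\rho - \rho\partial_0\theta - \frac{1}{2m}\left[\rho(\partial_i\theta)^2 + \frac{1}{4\rho}(\partial_i\rho)^2\right] - g^2\rho^2 \tag{7}$$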
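
The substitution $\varphi = \sqrt{\rho}\, e^{i\theta}$ behind (7) can be spot-checked numerically. The sketch below picks arbitrary smooth test profiles for $\rho$ and $\theta$ as functions of a single variable $s$ (standing in for either $t$ or one spatial coordinate) and compares both sides using central finite differences. The profiles, the step size, and the function names are assumptions made for the illustration.

```python
import cmath
import math


def rho(s):
    return 2.0 + math.sin(s)      # arbitrary positive density profile


def theta(s):
    return 0.3 * s * s            # arbitrary phase profile


def phi(s):
    return math.sqrt(rho(s)) * cmath.exp(1j * theta(s))


def d(f, s, h=1e-5):
    """Central finite-difference derivative of f at s."""
    return (f(s + h) - f(s - h)) / (2 * h)


def time_term_lhs(s):
    """i phi^dagger d(phi), the first term of the Lagrangian (6)."""
    return 1j * phi(s).conjugate() * d(phi, s)


def time_term_rhs(s):
    """(i/2) d(rho) - rho d(theta), the first two terms of (7)."""
    return 0.5j * d(rho, s) - rho(s) * d(theta, s)


def gradient_term_lhs(s):
    """|d(phi)|^2, the combination entering the kinetic term of (6)."""
    return abs(d(phi, s)) ** 2


def gradient_term_rhs(s):
    """rho (d theta)^2 + (d rho)^2 / (4 rho), the bracket in (7)."""
    return rho(s) * d(theta, s) ** 2 + d(rho, s) ** 2 / (4 * rho(s))


for s in (0.7, 1.9):
    print(abs(time_term_lhs(s) - time_term_rhs(s)),
          abs(gradient_term_lhs(s) - gradient_term_rhs(s)))
```

Both differences are at the level of the finite-difference error, consistent with (7) being an exact rewriting of (6) in the new variables.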


The first term is a total divergence. The second term tells us something of great importance¹
in condensed matter physics: in the canonical formalism (chapter I.8), the momentum
density conjugate to the phase field $\theta(x)$ is $\delta\mathcal{L}/\delta\partial_0\theta = -\rho$ and thus Heisenberg tells
us that

$$[\rho(\vec{x}, t), \theta(\vec{x}', t)] = i\delta^{(D)}(\vec{x} - \vec{x}') \tag{8}$$

Integrating and defining $N \equiv \int d^Dx\, \rho(\vec{x}, t)$ = the total number of bosons, we find one of
the most important relations in condensed matter physics

$$[N, \theta] = i \tag{9}$$

Number is conjugate to phase angle, just as momentum is conjugate to position. Marvel
at the elegance of this! You would learn in a condensed matter course that this fundamental
relation underlies the physics of the Josephson junction.

¹ See P. Anderson, Basic Notions of Condensed Matter Physics, p. 235.

You may know that a system of bosons with a “hard core” repulsion between them is a
superfluid at zero temperature. In particular, Bogoliubov showed that the system contains
an elementary excitation obeying a linear dispersion relation.² I will discuss superfluidity
in chapter V.1.

In the path integral formalism, going from the complex field $\varphi = \varphi_1 + i\varphi_2$ to $\rho$ and
$\theta$ amounts to a change of integration variables, as I remarked back in chapter I.8. In the
canonical formalism, since one deals with operators, one has to tread with somewhat more
finesse.


The sign of repulsion 

In the nonrelativistic theory (7) it is clear that the bosons repel each other: Piling particles
into a high density region would cost you an energy density $g^2\rho^2$. But it is less clear in
the relativistic theory that $\lambda(\Phi^\dagger\Phi)^2$ with $\lambda$ positive corresponds to repulsion. I outline one
method in exercise III.5.3, but here let’s just take a flying heuristic guess. The Hamiltonian
(density) involves the negative of the Lagrangian and hence goes as $\lambda(\Phi^\dagger\Phi)^2$ for large $\Phi$
and would thus be unbounded below for $\lambda < 0$. We know physically that a free Bose gas
tends to condense and clump, and with an attractive interaction it surely might want to
collapse. We naturally guess that $\lambda > 0$ corresponds to repulsion.

I next give you a more foolproof method. Using the central identity of quantum field
theory we can rewrite the path integral for the theory in (1) as

$$Z = \int D\Phi\, D\sigma\, e^{i\int d^4x\, [(\partial\Phi^\dagger)(\partial\Phi) - m^2\Phi^\dagger\Phi + 2\sigma\Phi^\dagger\Phi + (1/\lambda)\sigma^2]} \tag{10}$$

Condensed matter physicists call the transformation from (1) to the Lagrangian
$\mathcal{L} = (\partial\Phi^\dagger)(\partial\Phi) - m^2\Phi^\dagger\Phi + 2\sigma\Phi^\dagger\Phi + (1/\lambda)\sigma^2$ the Hubbard-Stratonovich transformation. In
field theory, a field that does not have kinetic energy, such as $\sigma$, is known as an auxiliary field
and can be integrated out in the path integral. When we come to the superfield formalism
in chapter VIII.4, auxiliary fields will play an important role.

Indeed, you might recall from chapter III.2 how a theory with an intermediate vector
boson could generate Fermi’s theory of the weak interaction. The same physics is involved
here: The theory (10) in which the $\Phi$ field is coupled to an “intermediate $\sigma$ boson” can
generate the theory (1).
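
That (10) reproduces (1) rests on completing the square in $\sigma$. Writing $n = \Phi^\dagger\Phi$, the $\sigma$-dependent terms are $2\sigma n + \sigma^2/\lambda = (\sigma + \lambda n)^2/\lambda - \lambda n^2$. The Gaussian integral over $\sigma$ then leaves $-\lambda n^2$ up to an overall constant, which is the same as evaluating the quadratic form at its stationary point $\sigma = -\lambda n$. The short check below uses exact rationals; the function names and sample values are invented for the example.

```python
from fractions import Fraction


def sigma_terms(sigma, n, lam):
    """The sigma-dependent part of the Lagrangian in (10): 2 sigma n + sigma^2 / lam."""
    return 2 * sigma * n + sigma * sigma / lam


def stationary_sigma(n, lam):
    """Solution of d/d(sigma) [2 sigma n + sigma^2 / lam] = 0."""
    return -lam * n


def completed_square(sigma, n, lam):
    """Same quantity written as (sigma + lam n)^2 / lam - lam n^2."""
    return (sigma + lam * n) ** 2 / lam - lam * n * n


n, lam = Fraction(3, 2), Fraction(5, 7)   # arbitrary sample values
s0 = stationary_sigma(n, lam)
print(sigma_terms(s0, n, lam), -lam * n * n)   # both equal -lambda n^2
```

At the stationary point the auxiliary-field terms collapse to $-\lambda(\Phi^\dagger\Phi)^2$, the interaction term of (1).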


² For example, L. D. Landau and E. M. Lifschitz, Statistical Physics, p. 238.

If $\sigma$ were a “normal scalar field” of the type we have studied, that is, if the terms quadratic
in $\sigma$ in the Lagrangian had the form $\frac{1}{2}(\partial\sigma)^2 - \frac{1}{2}M^2\sigma^2$, then its propagator would be
$i/(k^2 - M^2 + i\varepsilon)$. The scattering amplitude between two $\Phi$ bosons would be proportional
to this propagator. We learned in chapter I.4 that the exchange of a scalar field leads to an
attractive force.

But $\sigma$ is not a normal field as evidenced by the fact that the Lagrangian contains
only the quadratic term $+(1/\lambda)\sigma^2$. Thus its propagator is simply $i/(1/\lambda) = i\lambda$, which (for
$\lambda > 0$) has a sign opposite to the normal propagator evaluated at low-momentum transfer
$i/(k^2 - M^2 + i\varepsilon) \simeq -i/M^2$. We conclude that $\sigma$ exchange leads to a repulsive force.

Incidentally, this argument also shows that the repulsion is infinitely short ranged, like
a delta function interaction. Normally, as we learned in chapter I.4 the range is determined
by the interplay between the $k^2$ and the $M^2$ terms. Here the situation is as if the $M^2$ term
is infinitely large. We can also argue that the interaction $\lambda(\Phi^\dagger\Phi)^2$ involves creating two
bosons and then annihilating them both at the same spacetime point.


Finite density 

One final point of physics that people trained as particle physicists do not always remember:
Condensed matter physicists are not interested in empty space, but want to have a
finite density $\bar{\rho}$ of bosons around. We learned in statistical mechanics to add a chemical
potential term $\mu\varphi^\dagger\varphi$ to the Lagrangian (6). Up to an irrelevant (in this context!) additive
constant, we can rewrite the resulting Lagrangian as

$$\mathcal{L} = i\varphi^\dagger\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi - g^2(\varphi^\dagger\varphi - \bar{\rho})^2 \tag{11}$$

Amusingly, mass appears in different places in relativistic and nonrelativistic field
theories. To proceed further, I have to develop the concept of spontaneous symmetry
breaking. Thus, adios for now. We will come back to superfluidity in due time.


Exercises 


III.5.1 Obtain the Klein-Gordon equation for a particle in an electrostatic potential (such as that of the nucleus)
by the gauge principle of replacing $(\partial/\partial t)$ in (2) by $\partial/\partial t - ieA_0$. Show that in the nonrelativistic limit
this reduces to Schrödinger’s equation for a particle in an external potential.

III.5.2 Take the nonrelativistic limit of the Dirac Lagrangian.

III.5.3 Given a field theory we can compute the scattering amplitude of two particles in the nonrelativistic limit.
We then postulate an interaction potential $U(\vec{x})$ between the two particles and use nonrelativistic quantum
mechanics to calculate the scattering amplitude, for example in Born approximation. Comparing
the two scattering amplitudes we can determine $U(\vec{x})$. Derive the Yukawa and the Coulomb potentials
this way. The application of this method to the $\lambda(\Phi^\dagger\Phi)^2$ interaction is slightly problematic since the
delta function interaction is a bit singular, but it should be all right for determining whether the force is
repulsive or attractive.




The Magnetic Moment of the Electron 


Dirac’s triumph 

I said in the preface that the emphasis in this book is not on computation, but how can I 
not tell you about the greatest triumph of quantum field theory? 

After Dirac wrote down his equation, the next step was to study how the electron interacts
with the electromagnetic field. According to the gauge principle already used to write
Schrödinger’s equation in an electromagnetic field, to obtain the Dirac equation for
an electron in an external electromagnetic field we merely have to replace the ordinary
derivative $\partial_\mu$ by the covariant derivative $D_\mu = \partial_\mu - ieA_\mu$:

$$(i\gamma^\mu D_\mu - m)\psi = 0 \tag{1}$$

Recall (II.1.27).

Acting on this equation with $(i\gamma^\mu D_\mu + m)$, we obtain $-(\gamma^\mu\gamma^\nu D_\mu D_\nu + m^2)\psi = 0$. We
have $\gamma^\mu\gamma^\nu D_\mu D_\nu = \frac{1}{2}(\{\gamma^\mu, \gamma^\nu\} + [\gamma^\mu, \gamma^\nu])D_\mu D_\nu = D_\mu D^\mu - i\sigma^{\mu\nu}D_\mu D_\nu$ and
$i\sigma^{\mu\nu}D_\mu D_\nu = (i/2)\sigma^{\mu\nu}[D_\mu, D_\nu] = (e/2)\sigma^{\mu\nu}F_{\mu\nu}$. Thus

$$\left(D_\mu D^\mu - \frac{e}{2}\sigma^{\mu\nu}F_{\mu\nu} + m^2\right)\psi = 0 \tag{2}$$

Now consider a weak constant magnetic field pointing in the 3rd direction for definiteness,
weak so that we can ignore the $(A_i)^2$ term in $(D_i)^2$. By gauge invariance, we can
choose $A_0 = 0$, $A_1 = -\frac{1}{2}Bx^2$, and $A_2 = \frac{1}{2}Bx^1$ (so that $F_{12} = \partial_1A_2 - \partial_2A_1 = B$). As we will
see, this is one calculation in which we really have to keep track of factors of 2. Then

$$\begin{aligned}
(D_i)^2 &= (\partial_i)^2 - ie(\partial_iA_i + A_i\partial_i) + O(A_i^2) \\
&= (\partial_i)^2 - 2\,\frac{ie}{2}B(x^1\partial_2 - x^2\partial_1) + O(A_i^2) \\
&= \vec{\nabla}^2 - e\vec{B}\cdot\vec{x}\times\vec{p} + O(A_i^2)
\end{aligned} \tag{3}$$

Note that we used $\partial_iA_i + A_i\partial_i = (\partial_iA_i) + 2A_i\partial_i = 2A_i\partial_i$, where in $(\partial_iA_i)$ the partial derivative
acts only on $A_i$. You may have recognized $\vec{L} \equiv \vec{x}\times\vec{p}$ as the orbital angular momentum
operator. Thus, the orbital angular momentum generates an orbital magnetic moment that
interacts with the magnetic field.

This calculation makes good physical sense. If we were studying the interaction of a
charged scalar field $\Phi$ with an external electromagnetic field we would start with

$$(D_\mu D^\mu + m^2)\Phi = 0 \tag{4}$$

obtained by replacing the ordinary derivative in the Klein-Gordon equation by covariant
derivatives. We would then go through the same calculation as in (3). Comparing (4) with
(2) we see that the spin of the electron contributes the additional term $(e/2)\sigma^{\mu\nu}F_{\mu\nu}$.


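As in chapter II.1 we write $\psi = \begin{pmatrix}\phi \\ \chi\end{pmatrix}$ in the Dirac basis and focus on $\phi$ since in the
nonrelativistic limit it dominates $\chi$. Recall that in that basis
$\sigma^{ij} = \epsilon^{ijk}\begin{pmatrix}\sigma^k & 0 \\ 0 & \sigma^k\end{pmatrix}$. Thus
$(e/2)\sigma^{\mu\nu}F_{\mu\nu}$ acting on $\phi$ is effectively equal to $(e/2)\sigma^3(F_{12} - F_{21}) = (e/2)2\sigma^3 B = 2e\vec B\cdot\vec S$
since $\vec S = (\vec\sigma/2)$. Make sure you understand all the factors of 2! Meanwhile, according to
what I told you in chapter II.1, we should write $\phi = e^{-imt}\Psi$, where $\Psi$ oscillates much more
slowly than $e^{-imt}$ so that $(\partial_0^2 + m^2)e^{-imt}\Psi \simeq e^{-imt}[-2im(\partial/\partial t)\Psi]$. Putting it all together,
we have

$$\left[-2im\frac{\partial}{\partial t} - \vec\nabla^2 - e\vec B\cdot(\vec L + 2\vec S)\right]\Psi = 0 \tag{5}$$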


There you have it! As if by magic, Dirac’s equation tells us that a unit of spin angular 
momentum interacts with a magnetic field twice as much as a unit of orbital angular 
momentum, an observational fact that had puzzled physicists deeply at the time. The 
calculation leading to (5) is justly celebrated as one of the greatest in the history of physics. 
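
The factor of 2 rests on the statement that $\sigma^{12}$ is block diagonal with $\sigma^3$ in each block. As a minimal sketch (not part of the original text), the following pure-Python check builds the gamma matrices in the Dirac basis, assuming the standard form $\gamma^0 = \mathrm{diag}(1, -1)$ and $\gamma^i$ with $\sigma^i$ and $-\sigma^i$ off the diagonal, and verifies the Clifford algebra, $\sigma^{12}$, and the spin term for $F_{12} = -F_{21} = B$. The helper names (`matmul`, `block`, `sigma`, `spin_term`) are illustrative choices.

```python
# Gamma matrices in the Dirac basis, stored as 4x4 nested lists (pure Python).

def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def negate(s):
    """Return minus a 2x2 matrix."""
    return [[-x for x in row] for row in s]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed convention: gamma^0 = diag(1, -1), gamma^i = [[0, s_i], [-s_i, 0]].
GAMMA = [block(I2, Z2, Z2, negate(I2))] + [block(Z2, s, negate(s), Z2) for s in PAULI]
ETA = [1, -1, -1, -1]  # metric signature (+, -, -, -)

def sigma(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    ab = matmul(GAMMA[mu], GAMMA[nu])
    ba = matmul(GAMMA[nu], GAMMA[mu])
    return [[0.5j * (ab[i][j] - ba[i][j]) for j in range(4)] for i in range(4)]

def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 eta^{mu nu} for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            ab = matmul(GAMMA[mu], GAMMA[nu])
            ba = matmul(GAMMA[nu], GAMMA[mu])
            for i in range(4):
                for j in range(4):
                    target = 2 * ETA[mu] if (mu == nu and i == j) else 0
                    if abs(ab[i][j] + ba[i][j] - target) > 1e-12:
                        return False
    return True

def spin_term(B):
    """(1/2) sigma^{mu nu} F_{mu nu} for F_12 = -F_21 = B, other components zero."""
    s12, s21 = sigma(1, 2), sigma(2, 1)
    return [[0.5 * (s12[i][j] * B + s21[i][j] * (-B)) for j in range(4)]
            for i in range(4)]

# sigma^3 in both diagonal blocks, i.e. twice the spin operator S_3 = sigma^3 / 2.
SIGMA3_BLOCKS = block(PAULI[2], Z2, Z2, PAULI[2])

print(clifford_ok())
print(sigma(1, 2) == SIGMA3_BLOCKS)
print(spin_term(1.0) == SIGMA3_BLOCKS)
```

With $B = 1$ the spin term equals $\sigma^3 = 2S_3$ in each block, which is the origin of the $2\vec S$ in (5).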

The story is that Dirac did not do this calculation until a day after he discovered his 
equation, so sure was he that the equation had to be right. Another version is that he 
dreaded the possibility that the magnetic moment would come out wrong and that Nature 
would not take advantage of his beautiful equation. 

Another way of seeing that the Dirac equation contains a magnetic moment is by the
Gordon decomposition, the proof of which is given in an exercise:

$$\bar u(p')\gamma^\mu u(p) = \bar u(p')\left[\frac{(p' + p)^\mu}{2m} + \frac{i\sigma^{\mu\nu}(p' - p)_\nu}{2m}\right]u(p) \tag{6}$$

Looking at the interaction with an electromagnetic field $\bar u(p')\gamma^\mu u(p)A_\mu(p' - p)$, we see
that the first term in (6) only depends on the momentum $(p' + p)^\mu$ and would have
been there even if we were treating the interaction of a charged scalar particle with the
electromagnetic field to first order. The second term involves spin and gives the magnetic
moment. One way of saying this is that $\bar u(p')\gamma^\mu u(p)$ contains a magnetic moment
component.

The anomalous magnetic moment

With improvements in experimental techniques, it became clear by the late 1940’s that the 
magnetic moment of the electron was larger than the value calculated by Dirac by a factor 
of 1.00118 ± 0.00003. The challenge to any theory of quantum electrodynamics was to 
calculate this so-called anomalous magnetic moment. As you probably know, Schwinger’s 
spectacular success in meeting this challenge established the correctness of relativistic 
quantum field theory, at least in dealing with electromagnetic phenomena, beyond any 
doubt. 

Before we plunge into the calculation, note that Lorentz invariance and current conservation
tell us (see exercise III.6.3) that the matrix element of the electromagnetic current
must have the form (here $|p, s\rangle$ denotes a state with an electron of momentum p and
polarization s)

$$\langle p', s'|J^\mu(0)|p, s\rangle = \bar u(p', s')\left[\gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}F_2(q^2)\right]u(p, s) \tag{7}$$

where $q = (p' - p)$. The functions $F_1(q^2)$ and $F_2(q^2)$, about which Lorentz invariance can
tell us nothing, are known as form factors. To leading order in momentum transfer q, (7)
becomes

$$\bar u(p', s')\left\{\frac{(p' + p)^\mu}{2m}F_1(0) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}[F_1(0) + F_2(0)]\right\}u(p, s)$$

by the Gordon decomposition. The coefficient of the first term is the electric charge
observed by experimentalists and is by definition equal to 1. (To see this, think of potential
scattering, for example. See chapter II.6.) Thus $F_1(0) = 1$. The magnetic moment of the
electron is shifted from the Dirac value by a factor $1 + F_2(0)$.
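
Since the Gordon decomposition (6) carries much of the argument here, a numerical spot check may be reassuring. The sketch below is an illustration under stated assumptions, not the book's method: it uses the Dirac-basis gamma matrices, builds on-shell spinors as $u(p) = (\not{p} + m)w$ for an arbitrary four-component $w$ (which solves $(\not{p} - m)u = 0$ when $p^2 = m^2$), and compares both sides of (6) component by component. The momenta and the vectors `w` are arbitrary test values.

```python
import math

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

def negate(s):
    return [[-x for x in row] for row in s]

I2, Z2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
GAMMA = [block(I2, Z2, Z2, negate(I2))] + [block(Z2, s, negate(s), Z2) for s in PAULI]
ETA = [1, -1, -1, -1]  # metric signature (+, -, -, -)
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def combine(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def sigma(mu, nu):
    """sigma^{mu nu} = (i/2) [gamma^mu, gamma^nu]."""
    ab, ba = matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu])
    return combine([(0.5j, ab), (-0.5j, ba)])

def on_shell(m, px, py, pz):
    """Contravariant momentum p^mu with p^2 = m^2."""
    return [math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz]

def spinor(p, m, w):
    """u(p) = (pslash + m) w, which satisfies (pslash - m) u = 0 on shell."""
    op = combine([(ETA[mu] * p[mu], GAMMA[mu]) for mu in range(4)] + [(m, IDENTITY)])
    return [sum(op[i][k] * w[k] for k in range(4)) for i in range(4)]

def bar(u):
    """Dirac adjoint: u-dagger times gamma^0."""
    return [sum(u[k].conjugate() * GAMMA[0][k][j] for k in range(4)) for j in range(4)]

def sandwich(ubar, mat, u):
    """The number ubar * mat * u."""
    return sum(ubar[i] * mat[i][j] * u[j] for i in range(4) for j in range(4))

def gordon_sides(p_out, p_in, m, w_out, w_in):
    """Return [(lhs, rhs)] of the Gordon decomposition (6) for mu = 0..3."""
    u_in, ubar_out = spinor(p_in, m, w_in), bar(spinor(p_out, m, w_out))
    q_lower = [ETA[nu] * (p_out[nu] - p_in[nu]) for nu in range(4)]
    pairs = []
    for mu in range(4):
        rhs_mat = combine(
            [((p_out[mu] + p_in[mu]) / (2 * m), IDENTITY)]
            + [(1j * q_lower[nu] / (2 * m), sigma(mu, nu)) for nu in range(4)]
        )
        pairs.append((sandwich(ubar_out, GAMMA[mu], u_in),
                      sandwich(ubar_out, rhs_mat, u_in)))
    return pairs

pairs = gordon_sides(on_shell(1.0, 0.3, -0.2, 0.5), on_shell(1.0, -0.1, 0.4, 0.2),
                     1.0, [1, 0.5j, -0.3, 0.2], [0.7, -0.1, 0.4j, 1])
print(max(abs(lhs - rhs) for lhs, rhs in pairs))
```

The printed number is the largest mismatch between the two sides over the four components; it should be at the level of floating-point rounding.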

Schwinger’s triumph 

Let us now calculate $F_2(0)$ to order $\alpha = e^2/4\pi$. First draw all the relevant Feynman diagrams
to this order (fig. III.6.1). Except for figure 1b, all the Feynman diagrams are clearly
proportional to $\bar u(p', s')\gamma^\mu u(p, s)$ and thus contribute to $F_1(q^2)$, which we don’t care about.
Happy are we! We only have to calculate one Feynman diagram.



(a) (b) (c) (d) (e)

Figure III.6.1

It is convenient to normalize the contribution of figure 1b by comparing it to the lowest
order contribution of figure 1a and write the sum of the two contributions as $\bar u(\gamma^\mu + \Gamma^\mu)u$.
Applying the Feynman rules, we find

$$\Gamma^\mu = \int\frac{d^4k}{(2\pi)^4}\frac{-i}{k^2}\left(ie\gamma^\nu\frac{i}{\not{p}' + \not{k} - m}\gamma^\mu\frac{i}{\not{p} + \not{k} - m}ie\gamma_\nu\right) \tag{8}$$


I will now go through the calculation in some detail not only because it is important, but 
also because we will be using a variety of neat tricks springing from the brilliant minds of 
Schwinger and Feynman. You should verify all the steps of course. 

Simplifying somewhat we obtain $\Gamma^\mu = -ie^2\int[d^4k/(2\pi)^4](N^\mu/D)$, where

$$N^\mu = \gamma^\nu(\not{p}' + \not{k} + m)\gamma^\mu(\not{p} + \not{k} + m)\gamma_\nu \tag{9}$$

and

$$\frac{1}{D} = \frac{1}{(p' + k)^2 - m^2}\,\frac{1}{(p + k)^2 - m^2}\,\frac{1}{k^2} = 2\int d\alpha\,d\beta\,\frac{1}{\mathcal{D}} \tag{10}$$

We have used the identity (D.16). The integral is evaluated over the triangle in the ($\alpha$-$\beta$)
plane bounded by $\alpha = 0$, $\beta = 0$, and $\alpha + \beta = 1$, and

$$\mathcal{D} = [k^2 + 2k(\alpha p' + \beta p)]^3 = [l^2 - (\alpha + \beta)^2 m^2]^3 + O(q^2) \tag{11}$$

where we completed a square by defining $k = l - (\alpha p' + \beta p)$. The momentum integration
is now over $d^4l$.

Our strategy is to massage $N^\mu$ into a form consisting of a linear combination of $\gamma^\mu$, $p^\mu$,
and $p'^\mu$. Invoking the Gordon decomposition (6) we can write (7) as

$$\bar u\left\{\gamma^\mu[F_1(q^2) + F_2(q^2)] - \frac{1}{2m}(p' + p)^\mu F_2(q^2)\right\}u$$

Thus, to extract $F_2(0)$ we can throw away without ceremony any term proportional to $\gamma^\mu$
that we encounter while massaging $N^\mu$. So, let’s proceed.

Eliminating k in favor of l in (9) we obtain

$$N^\mu = \gamma^\nu[\not{l} + \not{P}' + m]\gamma^\mu[\not{l} + \not{P} + m]\gamma_\nu \tag{12}$$

where $P'^\mu = (1 - \alpha)p'^\mu - \beta p^\mu$ and $P^\mu = (1 - \beta)p^\mu - \alpha p'^\mu$. I will use the identities in
appendix D repeatedly, without alerting you every time I use one. It is convenient to
organize the terms in $N^\mu$ by powers of m. (Here I give up writing in complete grammatical
sentences.)


1. The $m^2$ term: a $\gamma^\mu$ term, throw away.

2. The m terms: organize by powers of l. The term linear in l integrates to 0 by symmetry.
Thus, we are left with the term independent of l:

$$m(\gamma^\nu\not{P}'\gamma^\mu\gamma_\nu + \gamma^\nu\gamma^\mu\not{P}\gamma_\nu) = 4m[(1 - 2\alpha)p'^\mu + (1 - 2\beta)p^\mu] \to 4m(1 - \alpha - \beta)(p' + p)^\mu \tag{13}$$

In the last step I used a handy trick; since $\mathcal{D}$ is symmetric under $\alpha \leftrightarrow \beta$, we can
symmetrize the terms we get in $N^\mu$.

3. Finally, the most complicated $m^0$ term. The term quadratic in l: note that we can effectively
replace $l^\sigma l^\tau$ inside $\int d^4l/(2\pi)^4$ by $\frac{1}{4}\eta^{\sigma\tau}l^2$ by Lorentz invariance (this step is possible because
we have shifted the integration variable so that $\mathcal{D}$ is a Lorentz invariant function of $l^2$). Thus,
the term quadratic in l gives rise to a $\gamma^\mu$ term. Throw it away. Again we throw away the term
linear in l, leaving [use (D.6) here!]

$$\gamma^\nu\not{P}'\gamma^\mu\not{P}\gamma_\nu = -2\not{P}\gamma^\mu\not{P}' \to -2[(1 - \beta)\not{p} - \alpha m]\gamma^\mu[(1 - \alpha)\not{p}' - \beta m] \tag{14}$$

where in the last step we remembered that $\Gamma^\mu$ is to be sandwiched between $\bar u(p')$ and $u(p)$.
Again, it is convenient to organize the terms in (14) by powers of m. With the various tricks
we have already used, we find that the $m^2$ term can be thrown away, the m term gives
$2m(p' + p)^\mu[\alpha(1 - \alpha) + \beta(1 - \beta)]$, and the $m^0$ term gives $2m(p' + p)^\mu[-2(1 - \alpha)(1 - \beta)]$.
Putting it all together, we find that $N^\mu \to 2m(p' + p)^\mu(\alpha + \beta)(1 - \alpha - \beta)$.


We can now do the integral $\int[d^4l/(2\pi)^4](1/\mathcal{D})$ using (D.11). Finally, we obtain

$$\Gamma^\mu = -2ie^2\int d\alpha\,d\beta\left(\frac{-i}{32\pi^2}\right)\frac{1}{(\alpha + \beta)^2 m^2}N^\mu = -\frac{e^2}{8\pi^2}\frac{1}{2m}(p' + p)^\mu \tag{15}$$

and thus, trumpets please:

$$F_2(0) = \frac{e^2}{8\pi^2} = \frac{\alpha}{2\pi} \tag{16}$$

Schwinger’s announcement of this result in 1948 had an electrifying impact on the theoretical
physics community.
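
The last step of (15) is a two-dimensional integral over the Feynman-parameter triangle. After inserting $N^\mu \to 2m(p' + p)^\mu(\alpha + \beta)(1 - \alpha - \beta)$, what remains is $I = \int d\alpha\,d\beta\,(1 - \alpha - \beta)/(\alpha + \beta)$, and (15) gives $F_2(0) = (e^2/8\pi^2)\,2I$. The sketch below estimates $I$ with a simple midpoint rule (an illustrative choice, not the method of the text) and evaluates $\alpha/2\pi$. The value $\alpha \approx 1/137.036$ is an input assumed here; the text does not quote it.

```python
import math

ALPHA = 1 / 137.036  # fine-structure constant (assumed input, not from the text)

def triangle_integral(n=400):
    """Midpoint-rule estimate of the integral of (1 - a - b) / (a + b)
    over the triangle a >= 0, b >= 0, a + b <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        for j in range(n):
            b = (j + 0.5) * h
            if a + b < 1.0:
                total += (1.0 - a - b) / (a + b)
    return total * h * h

def f2_at_zero(alpha=ALPHA, n=400):
    """F_2(0) = (e^2 / 8 pi^2) * 2 I with e^2 = 4 pi alpha."""
    return (alpha / (2 * math.pi)) * 2 * triangle_integral(n)

print(triangle_integral())                     # close to the exact value 1/2
print(f2_at_zero(), ALPHA / (2 * math.pi))     # numerical vs. closed form
```

Changing variables to $s = \alpha + \beta$ gives $I = \int_0^1 s\,ds\,(1 - s)/s = 1/2$ exactly, so the midpoint estimate only illustrates the convergence. With the assumed $\alpha$, $1 + \alpha/2\pi \approx 1.00116$, to be compared with the measured factor $1.00118 \pm 0.00003$ quoted earlier.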

I gave you in this chapter not one, but two, of the great triumphs of twentieth century 
physics, although admittedly the first is not a result of field theory per se. 


Exercises 


III.6.1 Evaluate $\bar u(p')(\not{p}'\gamma^\mu + \gamma^\mu\not{p})u(p)$ in two different ways and thus prove Gordon decomposition.

III.6.2 Check that (7) is consistent with current conservation. [Hint: By translation invariance (we suppress the
spin variable)

$$\langle p'|J^\mu(x)|p\rangle = \langle p'|J^\mu(0)|p\rangle e^{i(p' - p)x}$$

and hence

$$\langle p'|\partial_\mu J^\mu(x)|p\rangle = i(p' - p)_\mu\langle p'|J^\mu(0)|p\rangle e^{i(p' - p)x}$$

Thus current conservation implies that $q_\mu\langle p'|J^\mu(0)|p\rangle = 0$.]

III.6.3 By Lorentz invariance the right hand side of (7) has to be a vector. The only possibilities are $\bar u\gamma^\mu u$,
$(p + p')^\mu\bar uu$, and $(p - p')^\mu\bar uu$. The last term is ruled out because it would not be consistent with current
conservation. Show that the form given in (7) is in fact the most general allowed.

III.6.4 In chapter II.6, when discussing electron-proton scattering, we ignored the strong interaction that the
proton participates in. Argue that the effects of the strong interaction could be included phenomenologically
by replacing the vertex $\bar u(P, S)\gamma^\mu u(p, s)$ in (II.6.1) by

$$\langle P, S|J^\mu(0)|p, s\rangle = \bar u(P, S)\left[\gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}F_2(q^2)\right]u(p, s) \tag{17}$$

Careful measurements of electron-proton scattering, thus determining the two proton form factors
$F_1(q^2)$ and $F_2(q^2)$, earned R. Hofstadter the 1961 Nobel Prize. While we could account for the general
behavior of these two form factors, we are still unable to calculate them from first principles (in contrast
to the corresponding form factors for the electron). See chapters IV.2 and VII.3.



7 Polarizing the Vacuum and 
Renormalizing the Charge 


A photon can fluctuate into an electron and a positron 

One early triumph of quantum electrodynamics is the understanding of how quantum 
fluctuations affect the way the photon propagates. A photon can always metamorphose into 
an electron and a positron that, after a short time mandated by the uncertainty principle, 
annihilate each other becoming a photon again. The process, which keeps on repeating 
itself, is depicted in figure III.7.1. 

Quantum fluctuations are not limited to what we just described. The electron and 
positron can interact by exchanging a photon, which in turn can change into an electron 
and a positron, and so on and so forth. The full process is shown in figure III.7.2, where the 
shaded object, denoted by $i\Pi_{\mu\nu}(q)$ and known as the vacuum polarization tensor, is given
by an infinite number of Feynman diagrams, as shown in figure III.7.3. Figure III.7.1 is
obtained from figure III.7.2 by approximating $i\Pi_{\mu\nu}(q)$ by its lowest order diagram.

It is convenient to rewrite the Lagrangian $\mathcal{L} = \bar\psi[i\gamma^\mu(\partial_\mu - ieA_\mu) - m]\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ by
letting $A \to (1/e)A$, which we are always allowed to do, so that

$$\mathcal{L} = \bar\psi[i\gamma^\mu(\partial_\mu - iA_\mu) - m]\psi - \frac{1}{4e^2}F_{\mu\nu}F^{\mu\nu} \tag{1}$$

Note that the gauge transformation leaving $\mathcal{L}$ invariant is given by $\psi \to e^{i\alpha}\psi$ and
$A_\mu \to A_\mu + \partial_\mu\alpha$. The photon propagator (chapter III.4), obtained roughly speaking by inverting
$(1/4e^2)F_{\mu\nu}F^{\mu\nu}$, is now proportional to $e^2$:

$$iD_{\mu\nu}(q) = \frac{-ie^2}{q^2}\left[g_{\mu\nu} - (1 - \xi)\frac{q_\mu q_\nu}{q^2}\right] \tag{2}$$

Every time a photon is exchanged, the amplitude gets a factor of $e^2$. This is just a trivial
but convenient change and does not affect the physics in the slightest. For example, in the
Feynman diagram we calculated in chapter II.6 for electron-electron scattering, the factor
$e^2$ can be thought of as being associated with the photon propagator rather than as coming
from the interaction vertices. In this interpretation $e^2$ measures the ease with which the
photon propagates through spacetime. The smaller $e^2$, the more action it takes to have
the photon propagate, and the harder for the photon to propagate, the weaker the effect of
electromagnetism.

Figure III.7.1

Figure III.7.3

The diagrammatic proof of gauge invariance given in chapter II.7 implies that
$q^\mu\Pi_{\mu\nu}(q) = 0$. Together with Lorentz invariance, this requires that

$$\Pi_{\mu\nu}(q) = (q_\mu q_\nu - g_{\mu\nu}q^2)\Pi(q^2) \tag{3}$$


The physical or renormalized photon propagator as shown in figure III.7.2 is then given
by the geometric series

$$\begin{aligned}
iD^P_{\mu\nu}(q) &= iD_{\mu\nu}(q) + iD_{\mu\lambda}(q)\,i\Pi^{\lambda\rho}(q)\,iD_{\rho\nu}(q) \\
&\quad + iD_{\mu\lambda}(q)\,i\Pi^{\lambda\rho}(q)\,iD_{\rho\sigma}(q)\,i\Pi^{\sigma\kappa}(q)\,iD_{\kappa\nu}(q) + \cdots \\
&= \frac{-ie^2}{q^2}g_{\mu\nu}\{1 - e^2\Pi(q^2) + [e^2\Pi(q^2)]^2 + \cdots\} + q_\mu q_\nu \text{ term} \\
&= \frac{-ie^2}{q^2}g_{\mu\nu}\frac{1}{1 + e^2\Pi(q^2)} + q_\mu q_\nu \text{ term}
\end{aligned} \tag{4}$$

Because of (3) the $(1 - \xi)(q_\mu q_\lambda/q^2)$ part of $D_{\mu\lambda}(q)$ is annihilated when it encounters
$\Pi^{\lambda\rho}(q)$. Thus, in $iD^P_{\mu\nu}(q)$ the gauge parameter $\xi$ enters only into the $q_\mu q_\nu$ term and drops
out in physical amplitudes, as explained in chapter II.7.





The residue of the pole in $iD^P_{\mu\nu}(q)$ is the physical or renormalized charge squared:

$$e_R^2 = e^2\frac{1}{1 + e^2\Pi(0)} \tag{5}$$


Respect for gauge invariance 


In order to determine $e_R$ in terms of e, let us calculate to lowest order

$$i\Pi^{\mu\nu}(q) = (-)\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(i\gamma^\nu\frac{i}{\not{p} + \not{q} - m}\,i\gamma^\mu\frac{i}{\not{p} - m}\right) \tag{6}$$

For large p the integrand goes as $1/p^2$ with a subleading term going as $m^2/p^4$ causing
the integral to have a quadratically divergent and a logarithmically divergent piece. (You
see, it is easy to slip into bad language.) Not a conceptual problem at all, as I explained
in chapter III.1. We simply regularize. But now there is a delicate point: Since gauge
invariance plays a crucial role, we must make sure that our regularization respects gauge
invariance.

In the Pauli-Villars regularization (III.1.13) we replace (6) by

$$i\Pi^{\mu\nu}(q) = (-)\int\frac{d^4p}{(2\pi)^4}\left[\mathrm{tr}\left(i\gamma^\nu\frac{i}{\not{p} + \not{q} - m}\,i\gamma^\mu\frac{i}{\not{p} - m}\right) - \sum_a c_a\,\mathrm{tr}\left(i\gamma^\nu\frac{i}{\not{p} + \not{q} - m_a}\,i\gamma^\mu\frac{i}{\not{p} - m_a}\right)\right] \tag{7}$$

Now the integrand goes as $(1 - \sum_a c_a)(1/p^2)$ with a subleading term going as
$(m^2 - \sum_a c_a m_a^2)(1/p^4)$, and thus the integral would converge if we choose $c_a$ and $m_a$ such that

$$\sum_a c_a = 1 \tag{8}$$

and

$$\sum_a c_a m_a^2 = m^2 \tag{9}$$

Clearly, we have to introduce at least two regulator masses. We are confessing to ignorance
of the physics above the mass scale $m_a$. The integral in (7) is effectively cut off when the
momentum p exceeds $m_a$.
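
To see concretely why two regulators are the minimum, note that with a single regulator (8) forces $c_1 = 1$ and (9) then forces $m_1 = m$, so the regulator term would cancel the original integrand entirely. With two regulator masses, (8) and (9) form a linear system for $c_1$ and $c_2$. The sketch below solves it for sample values; the masses 50 and 80 (in units of m) and the function names are hypothetical choices for illustration.

```python
def pauli_villars_coefficients(m, m1, m2):
    """Solve c1 + c2 = 1 and c1*m1^2 + c2*m2^2 = m^2 for two regulators."""
    if m1 == m2:
        raise ValueError("the two regulator masses must differ")
    c1 = (m * m - m2 * m2) / (m1 * m1 - m2 * m2)
    return c1, 1.0 - c1

def leading_coefficients(m, masses, coeffs):
    """Coefficients of 1/p^2 and 1/p^4 in the large-p integrand of (7)."""
    quadratic = 1.0 - sum(coeffs)
    logarithmic = m * m - sum(c * ma * ma for c, ma in zip(coeffs, masses))
    return quadratic, logarithmic

# Hypothetical regulator masses, in units of the electron mass m = 1.
c1, c2 = pauli_villars_coefficients(1.0, 50.0, 80.0)
print(c1, c2)
print(leading_coefficients(1.0, [50.0, 80.0], [c1, c2]))
```

Both leading coefficients vanish up to rounding, which is the statement that the regulated integrand falls off fast enough for the integral to converge. Note that one of the two coefficients comes out negative.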

Does a bell ring for you? It should, as this discussion conceptually parallels that in the
appendix to chapter I.9.

The gauge invariant form (3) we expect to get actually suggests that we need fewer
regulator terms than we think. Imagine expanding (6) in powers of q. Since

$$\Pi_{\mu\nu}(q) = (q_\mu q_\nu - g_{\mu\nu}q^2)[\Pi(0) + \cdots]$$

we are only interested in terms of $O(q^2)$ and higher in the Feynman integral. If we expand
the integrand in (6), we see that the term of $O(q^2)$ goes as $1/p^4$ for large p, thus giving a
logarithmically divergent (speaking bad language again!) contribution. (Incidentally, you
may recall that this sort of argument was also used in chapter III.3.) It seems that we need
only one regulator. This argument is not rigorous because we have not proved that $\Pi(q^2)$
has a power series expansion in $q^2$, but instead of worrying about it let us proceed with
the calculation.

Once the integral is convergent, the proof of gauge invariance given in chapter II.7 now
goes through. Let us recall briefly how the proof went. In computing $q^\mu\Pi_{\mu\nu}(q)$ we use the
identity

$$\frac{1}{\not{p} + \not{q} - m}\,\not{q}\,\frac{1}{\not{p} - m} = \frac{1}{\not{p} - m} - \frac{1}{\not{p} + \not{q} - m}$$

to split the integrand into two pieces that cancel upon shifting the integration variable
$p \to p + q$. Recall from exercise (II.7.2) that we were concerned that in some cases the
shift may not be allowed, but it is allowed if the integral is sufficiently convergent, as is
indeed the case now that we have regularized. In any event, the proof is in the eating of
the pudding, and we will see by explicit calculation that $\Pi_{\mu\nu}(q)$ indeed has the form in (3).


Having learned various computational tricks in the previous chapter you are now ready
to tackle the calculation. I will help by walking you through it. In order not to clutter up
the page I will suppress the regulator terms in (7) in the intermediate steps and restore
them toward the end. After a few steps you should obtain

$$i\Pi_{\mu\nu}(q) = -\int\frac{d^4p}{(2\pi)^4}\frac{N_{\mu\nu}}{D}$$

where $N_{\mu\nu} = \mathrm{tr}[\gamma_\nu(\not{p} + \not{q} + m)\gamma_\mu(\not{p} + m)]$ and

$$\frac{1}{D} = \int_0^1 d\alpha\,\frac{1}{\mathcal{D}}$$

with $\mathcal{D} = [l^2 + \alpha(1 - \alpha)q^2 - m^2 + i\varepsilon]^2$, where $l = p + \alpha q$. Eliminating p in favor of l and
beating on $N_{\mu\nu}$ you will find that $N_{\mu\nu}$ is effectively equal to

$$-4\left[\frac{1}{2}g_{\mu\nu}l^2 + \alpha(1 - \alpha)(2q_\mu q_\nu - g_{\mu\nu}q^2) - m^2 g_{\mu\nu}\right]$$

Integrate over l using (D.12) and (D.13) and, writing the contribution from the regulators
explicitly, obtain

$$\Pi_{\mu\nu}(q) = -\frac{1}{4\pi^2}\int_0^1 d\alpha\left[F_{\mu\nu}(m) - \sum_a c_a F_{\mu\nu}(m_a)\right] \tag{10}$$

where

$$\begin{aligned}
F_{\mu\nu}(m) &= \frac{1}{2}g_{\mu\nu}\left\{\Lambda^2 - 2[m^2 - \alpha(1 - \alpha)q^2]\log\frac{\Lambda^2}{m^2 - \alpha(1 - \alpha)q^2} + m^2 - \alpha(1 - \alpha)q^2\right\} \\
&\quad - [\alpha(1 - \alpha)(2q_\mu q_\nu - g_{\mu\nu}q^2) - m^2 g_{\mu\nu}]\left[\log\frac{\Lambda^2}{m^2 - \alpha(1 - \alpha)q^2} - 1\right]
\end{aligned} \tag{11}$$

Remember, you are doing the calculation; I am just pointing the way. In appendix D, $\Lambda$ was
introduced to give meaning to various divergent integrals. Since our integral is convergent,
we should not need $\Lambda$, and indeed, it is gratifying to see that in (10) $\Lambda$ drops out thanks
to the conditions (8) and (9). Some other terms drop out as well, and we end up with

$$\Pi_{\mu\nu}(q) = -\frac{1}{2\pi^2}(q_\mu q_\nu - g_{\mu\nu}q^2)\int_0^1 d\alpha\,\alpha(1 - \alpha)\left\{\log[m^2 - \alpha(1 - \alpha)q^2] - \sum_a c_a\log[m_a^2 - \alpha(1 - \alpha)q^2]\right\} \tag{12}$$

Lo and behold! The vacuum polarization tensor indeed has the form $\Pi_{\mu\nu}(q) = (q_\mu q_\nu - g_{\mu\nu}q^2)\Pi(q^2)$.
Our regularization scheme does respect gauge invariance.

For $q^2 \ll m_a^2$ (the kinematic regime we are interested in had better be much lower than
our threshold of ignorance) we simply define $\log M^2 \equiv \sum_a c_a\log m_a^2$ in (12) and obtain

$$\Pi(q^2) = \frac{1}{2\pi^2}\int_0^1 d\alpha\,\alpha(1 - \alpha)\log\frac{M^2}{m^2 - \alpha(1 - \alpha)q^2} \tag{13}$$

Note that our heuristic argument is indeed correct. In the end, effectively we need only
one regulator, but in the intermediate steps we needed two. Actually, this bickering over
the number of regulators is beside the point.

In chapter III.1 I mentioned dimensional regularization as an alternative to Pauli-Villars
regularization. Historically, dimensional regularization was invented to preserve gauge
invariance in nonabelian gauge theories (which I will discuss in a later chapter). It is
instructive to calculate $\Pi$ using dimensional regularization (exercise III.7.1).


Electric charge 


Physically, we end up with a result for $\Pi(q^2)$ containing a parameter $M^2$ expressing our
threshold of ignorance. We conclude that

$$e_R^2 = e^2\frac{1}{1 + (e^2/12\pi^2)\log(M^2/m^2)} \tag{14}$$


Quantum fluctuations effectively diminish the charge. I will explain the physical origin of 
this effect in a later chapter on renormalization group flow. 
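
The step from (13) to (14) uses $\int_0^1 d\alpha\,\alpha(1 - \alpha) = 1/6$, so that $\Pi(0) = (1/12\pi^2)\log(M^2/m^2)$. As a rough numerical illustration (not from the text), the sketch below integrates (13) with a midpoint rule and then evaluates (5). The inputs $e^2 = 4\pi/137.036$ and $M/m = 100$ are hypothetical sample values, and the function names are illustrative.

```python
import math

def vacuum_polarization(q2, m, M, n=2000):
    """Midpoint-rule estimate of Pi(q^2) in (13), valid for q2 < 4 m^2."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        total += a * (1.0 - a) * math.log(M * M / (m * m - a * (1.0 - a) * q2))
    return total * h / (2.0 * math.pi ** 2)

def renormalized_charge_squared(e2, m, M):
    """e_R^2 = e^2 / (1 + e^2 Pi(0)), as in (5)."""
    return e2 / (1.0 + e2 * vacuum_polarization(0.0, m, M))

E2 = 4.0 * math.pi / 137.036   # assumed sample value of e^2
M_OVER_m = 100.0               # hypothetical threshold of ignorance, in units of m

pi0 = vacuum_polarization(0.0, 1.0, M_OVER_m)
print(pi0, math.log(M_OVER_m ** 2) / (12.0 * math.pi ** 2))
print(renormalized_charge_squared(E2, 1.0, M_OVER_m) / E2)
```

The ratio printed on the last line is slightly below 1, in line with the statement that quantum fluctuations diminish the charge. For spacelike $q^2 < 0$ the same function shows $\Pi(q^2)$ decreasing as $|q^2|$ grows.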

You might argue that physically charge is measured by how strongly one electron scatters
off another electron. To order $e^4$, in addition to the diagrams in chapter II.7, we also have,
among others, the diagrams shown in figure III.7.4a,b,c. We have computed 4a, but what
about 4b and 4c? In many texts, it is shown that contributions of III.7.4b and III.7.4c to
charge renormalization cancel. The advantage of using the Lagrangian in (1) is that this
fact becomes self-evident: Charge is a measure of how the photon propagates.

To belabor a more or less self-evident point let us imagine doing physical or renormalized
perturbation theory as explained in chapter III.3. The Lagrangian is written in terms of
physical or renormalized fields (and as before we drop the subscript P on the fields)

$$\begin{aligned}
\mathcal{L} &= \bar\psi(i\gamma^\mu(\partial_\mu - iA_\mu) - m_P)\psi - \frac{1}{4e_P^2}F_{\mu\nu}F^{\mu\nu} \\
&\quad + A\bar\psi i\gamma^\mu(\partial_\mu - iA_\mu)\psi + B\bar\psi\psi - CF_{\mu\nu}F^{\mu\nu}
\end{aligned} \tag{15}$$

where the coefficients of the counterterms A, B, and C are determined iteratively. The
point is that gauge invariance guarantees that $\bar\psi i\gamma^\mu\partial_\mu\psi$ and $\bar\psi\gamma^\mu A_\mu\psi$ always occur in
the combination $\bar\psi i\gamma^\mu(\partial_\mu - iA_\mu)\psi$: The strength of the coupling of $A_\mu$ to $\bar\psi\gamma^\mu\psi$ cannot
change. What can change is the ease with which the photon propagates through spacetime.

(a) (b)

Figure III.7.4

This statement has profound physical implications. Experimentally, it is known to a 
high degree of accuracy that the charges of the electron and the proton are opposite and 
exactly equal. If the charges were not exactly equal, there would be a residual electrostatic 
force between macroscopic objects. Suppose we discovered a principle that tells us that 
the bare charges of the electron and the proton are exactly equal (indeed, as we will see, 
in grand unification theories, this fact follows from group theory). How do we know 
that quantum fluctuations would not make the charges slightly unequal? After all, the 
proton participates in the strong interaction and the electron does not and thus many 
more diagrams would contribute to the long range electromagnetic scattering between 
two protons. The discussion here makes clear that this equality will be maintained for the 
obvious reason that charge renormalization has to do with the photon. In the end, it is all 
due to gauge invariance. 


Modifying the Coulomb potential 

We have focused on charge renormalization, which is determined completely by $\Pi(0)$, but
in (13) we obtained the complete function $\Pi(q^2)$, which tells us how the q dependence of
the photon propagator is modified. According to the discussion in chapter I.5, the Coulomb
potential is just the Fourier transform of the photon propagator (see also exercise III.5.3).
Thus, the Coulomb interaction is modified from the venerable 1/r law at a distance scale
of the order of $(2m)^{-1}$, namely the inverse of the characteristic value of q in $\Pi(q^2)$. This
modification was experimentally verified as part of the Lamb shift in atomic spectroscopy,
another great triumph of quantum electrodynamics.

Exercises


III.7.1 Calculate $\Pi_{\mu\nu}(q)$ using dimensional regularization. The procedure is to start with (6), evaluate the trace
in $N_{\mu\nu}$, shift the integration momentum from p to l, and so forth, proceeding exactly as in the text,
until you have to integrate over the loop momentum l. At that point you “pretend” that you are living
in d-dimensional spacetime, so that the term like $l_\mu l_\nu$ in $N_{\mu\nu}$, for example, is to be effectively replaced
by $(1/d)g_{\mu\nu}l^2$. The integration is to be performed using (III.1.15) and various generalizations thereof.
Show that the form (3) automatically emerges when you continue to d = 4.

III.7.2 Study the modified Coulomb’s law as determined by the Fourier integral $\int d^3q\,\{1/\vec q^{\,2}[1 + e^2\Pi(\vec q^{\,2})]\}e^{i\vec q\cdot\vec x}$.




Becoming Imaginary and Conserving Probability 


When Feynman amplitudes go imaginary 

Let us admire the polarized vacuum, viz (III.7.13):

$$\Pi(q^2) = \frac{1}{2\pi^2}\int_0^1 d\alpha\,\alpha(1 - \alpha)\log\frac{\Lambda^2}{m^2 - \alpha(1 - \alpha)q^2} \tag{1}$$

Dear reader, you have come a long way in quantum field theory, to be able to calculate such
an amazing effect. Quantum fluctuations alter the way a photon propagates!

For a spacelike photon, with $q^2$ negative, $\Pi$ is real and positive for momentum small
compared to the threshold of our ignorance $\Lambda$. For a timelike photon, we see that if $q^2 > 0$
is large enough, the argument of the logarithm may go negative, and thus $\Pi$ becomes
complex. As you know, the logarithmic function log z could be defined in the complex
z plane with a cut that can be taken conventionally to go along the negative real axis,
so that for w real and positive, $\log(-w \pm i\varepsilon) = \log(w) \pm i\pi$ [since in polar coordinates
$\log(\rho e^{i\theta}) = \log(\rho) + i\theta$].

We now invite ourselves to define a function in the complex plane:
$\Pi(z) = \frac{1}{2\pi^2}\int_0^1 d\alpha\,\alpha(1 - \alpha)\log[\Lambda^2/(m^2 - \alpha(1 - \alpha)z)]$. The integrand has a cut on the positive
real z axis extending from $z = m^2/(\alpha(1 - \alpha))$ to infinity (fig. III.8.1). Since the maximum
value of $\alpha(1 - \alpha)$ in the integration range is $\frac{1}{4}$, $\Pi(z)$ is an analytic function in the complex
z plane with a cut along the real axis starting at $z_c = 4m^2$. The integral over $\alpha$ smears all
those cuts of the integrand into one single cut.

For timelike photons with large enough $q^2$, a mathematician might be paralyzed wondering
which side of the cut to go to, but we as physicists know, as per the $i\varepsilon$ prescription
from chapter I.3, that we should approach the cut from above, namely that we should take
$\Pi(q^2 + i\varepsilon)$ (with $\varepsilon$, as always, a positive infinitesimal) as the physical value. Ultimately,
causality tells us which side of the cut we should be on.
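
The onset of the imaginary part at $q^2 = 4m^2$ can be seen numerically. The sketch below evaluates $\Pi(q^2 + i\varepsilon)$ from (1) with a small finite $\varepsilon$, using the principal branch of the logarithm (cut along the negative real axis, as in Python's `cmath.log`). For comparison it uses the closed form $\mathrm{Im}\,\Pi = \frac{1}{12\pi}\sqrt{1 - 4m^2/q^2}\,(1 + 2m^2/q^2)$, which follows from integrating $\alpha(1 - \alpha)$ over the interval where the argument of the logarithm is negative; that formula is derived here rather than quoted from the text. The units $m = 1$, the cutoff value, and the function names are assumptions for illustration.

```python
import cmath
import math

def vacuum_polarization(q2, m=1.0, cutoff=100.0, eps=1e-9, n=20000):
    """Midpoint-rule estimate of Pi(q^2 + i eps) from equation (1)."""
    h = 1.0 / n
    z = complex(q2, eps)   # approach the cut from above
    total = 0j
    for i in range(n):
        a = (i + 0.5) * h
        total += a * (1.0 - a) * cmath.log(cutoff ** 2 / (m * m - a * (1.0 - a) * z))
    return total * h / (2.0 * math.pi ** 2)

def imaginary_part_closed_form(q2, m=1.0):
    """Im Pi(q^2 + i eps): zero below threshold, nonzero for q^2 > 4 m^2."""
    if q2 <= 4.0 * m * m:
        return 0.0
    beta = math.sqrt(1.0 - 4.0 * m * m / q2)
    return beta * (1.0 + 2.0 * m * m / q2) / (12.0 * math.pi)

for q2 in (2.0, 10.0):     # below and above the threshold 4 m^2 = 4
    print(q2, vacuum_polarization(q2).imag, imaginary_part_closed_form(q2))
```

Replacing `complex(q2, eps)` by `complex(q2, -eps)` flips the sign of the imaginary part, which is the discontinuity across the cut.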

That the imaginary part of $\Pi$ starts at $\sqrt{q^2} > 2m$ provides a strong hint of the physics
behind amplitudes going complex. We began the preceding chapter talking about how
a photon merrily propagating along could always metamorphose into a virtual electron-positron
pair that, after a short time dictated by the uncertainty principle, annihilate each
other to become a photon again. For $\sqrt{q^2} > 2m$ the pair is no longer condemned to be
virtual and to fluctuate out of existence almost immediately. The pair has enough energy
to get real. (If you did the exercises religiously, you would recognize that these points were
already developed in exercises I.7.4 and III.1.2.)

Physically, we could argue more forcefully as follows. Imagine a gauge boson of mass M
coupling to electrons just like the photon. (Indeed, in this book we started out supposing
that the photon has a mass.) The vacuum polarization diagram then provides a one-loop
correction to the vector boson propagator. For M > 2m, the vector boson becomes unstable
against decay into an electron-positron pair. At the same time, $\Pi$ acquires an imaginary
part. You might suspect that $\mathrm{Im}\,\Pi$ might have something to do with the decay rate. We will
verify these suspicions later and show that, hey, your physical intuition is pretty good.

When we ended the preceding chapter talking about the modifications to the Coulomb
potential, we thought of a spacelike virtual photon being exchanged between two charges as
in electron-electron or electron-proton scattering (chapter II.6). Use crossing (chapter II.8)
to map electron-electron scattering into electron-positron scattering. The vacuum polarization
diagram then appears (fig. III.8.2) as a correction to electron-positron scattering. One
function $\Pi$ covers two different physical situations.

Incidentally, the title of this section should, strictly speaking, have the word “complex,” 
but it is more dramatic to say “When Feynman amplitudes go imaginary,” if only to echo 
certain movie titles. 


Dispersion relations and high frequency behavior 

One of the most remarkable discoveries in elementary 
particle physics has been that of the existence of the 
complex plane. 

—J. Schwinger 

Considering that amplitudes are calculated in quantum field theory as integrals over
products of propagators, it is more or less clear that amplitudes are analytic functions
of the external kinematic variables. Another example is the scattering amplitude $\mathcal{M}$ in
chapter III.1: it is manifestly an analytic function of s, t, and u with various cuts. From the
late 1950s until the early 1960s, considerable effort was devoted to studying analyticity in
quantum field theory, resulting in a vast literature.

Figure III.8.2

Here we merely touch upon some elementary aspects. Let us start with an embarrassingly
simple baby example: $f(z) = \int_0^1 d\alpha\,1/(z - \alpha) = \log(z/(z - 1))$. The integrand has a
pole at $z = \alpha$, which got smeared by the integral over $\alpha$ into a cut stretching from 0 to 1.
At the level of physicist rigor, we may think of a cut as a lot of poles mashed together and
a pole as an infinitesimally short cut.

We will mostly encounter real analytic functions, namely functions satisfying $f^*(z) = f(z^*)$
(such as log z). Furthermore, we focus on functions that have cuts along the real axis,
as exemplified by $\Pi(z)$. For the class of analytic functions specified here, the discontinuity
of the function across the cut is given by $\mathrm{disc}\,f(x) = f(x + i\varepsilon) - f(x - i\varepsilon) = f(x + i\varepsilon) - f(x + i\varepsilon)^* = 2i\,\mathrm{Im}\,f(x + i\varepsilon)$.
Define, for $\sigma$ real, $\rho(\sigma) = \mathrm{Im}\,f(\sigma + i\varepsilon)$. Using Cauchy’s
theorem with a contour C that goes around the cut as indicated in figure III.8.3, we could
write

$$f(z) = \oint_C\frac{dz'}{2\pi i}\frac{f(z')}{z' - z} \tag{2}$$

Assuming that f(z) vanishes faster than 1/z as $z \to \infty$, we can drop the contribution from
infinity and write

$$f(z) = \frac{1}{\pi}\int d\sigma\,\frac{\rho(\sigma)}{\sigma - z} \tag{3}$$


where the integral ranges over the cut. Note that we can check this equation using the
identity (I.2.14):

$$\mathrm{Im}\,f(x + i\varepsilon) = \frac{1}{\pi}\int d\sigma\,\rho(\sigma)\,\mathrm{Im}\frac{1}{\sigma - x - i\varepsilon} = \frac{1}{\pi}\int d\sigma\,\rho(\sigma)\,\pi\delta(\sigma - x)$$

Figure III.8.3


This relation tells us that knowing the imaginary part of / along the cut allows us to 
construct / in its entirety, including a fortiori its real part on and away from the cut. 
Relations of this type, known collectively as dispersion relations, go back at least to the 
work of Kramers and Kronig on optics and are enormously useful in many areas of physics. 
We will use it in, for example, chapter VII.4. 
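
The dispersion relation (3) can be tried out on the baby example $f(z) = \int_0^1 d\alpha/(z - \alpha)$, whose closed form is $\log(z/(z - 1))$ with a cut on the segment from 0 to 1. The sketch below, an illustration with arbitrary choices for the step count and test points, reads off $\rho(\sigma) = \mathrm{Im}\,f(\sigma + i\varepsilon)$ from the closed form just above the cut, feeds it into (3) with a midpoint rule, and compares with the closed form away from the cut.

```python
import cmath
import math

def f_closed(z):
    """Closed form of the baby example: log(z / (z - 1)), principal branch."""
    return cmath.log(z / (z - 1))

def rho(sigma, eps=1e-12):
    """Imaginary part of f just above the cut, rho(sigma) = Im f(sigma + i eps)."""
    return f_closed(complex(sigma, eps)).imag

def f_dispersive(z, n=20000):
    """Reconstruct f(z) from (3): (1/pi) * integral over the cut of rho / (sigma - z)."""
    h = 1.0 / n
    total = 0j
    for i in range(n):
        s = (i + 0.5) * h
        total += rho(s) / (s - z)
    return total * h / math.pi

for z in (2 + 1j, -0.5 - 0.3j):
    print(z, f_dispersive(z), f_closed(z))
```

On the cut $\rho(\sigma)$ is the constant $-\pi$, so knowing only that one number along the segment from 0 to 1 reproduces both the real and imaginary parts of $f$ everywhere else.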

We implicitly assumed that the integral over $\sigma$ converges. If not, we can always (formally)
subtract $f(0) = \frac{1}{\pi}\int d\sigma\,\rho(\sigma)/\sigma$ from f(z) as given above and write

$$f(z) = f(0) + \frac{z}{\pi}\int d\sigma\,\frac{\rho(\sigma)}{\sigma(\sigma - z)} \tag{4}$$

The integral over $\sigma$ now enjoys an additional factor of $1/\sigma$ and hence is more convergent.
In this case, to reconstruct f(z), we need, in addition to knowledge of the imaginary part
of f on the cut, an unknown constant f(0). Evidently, we could repeat this process until
we obtain a convergent integral.

A bell rings, and you, the astute reader, see the connection with the renormalization
procedure of introducing counterterms. In the dispersion Weltanschauung, divergent
Feynman integrals correspond to integrals over $\sigma$ that do not converge. Once again,
divergent integrals do not bend real physicists out of shape: we simply admit to ignorance
of the high $\sigma$ regime.

During the height of the dispersion program, it was jokingly said that particle theorists
either group or disperse, depending on whether you like group theory or complex analysis
better.

Imaginary part of Feynman integrals


Going back to the calculation of vacuum polarization in the preceding chapter, we see
that the numerator $N_{\mu\nu}$, which comes from the spin of the photon and of the electron,
is irrelevant in determining the analytic structure of the Feynman diagram. It is the
denominator D that counts. Thus, to get at a conceptual understanding of analyticity
in quantum field theory, we could dispense with spins and study the analog of vacuum
polarization in the scalar field theory with the interaction term $\mathcal{L} = g(\eta^\dagger\xi^\dagger\varphi + \text{h.c.})$,
introduced in the appendix to chapter II.6. The $\varphi$ propagator is corrected by the analog
of the diagrams in figure III.7.1 to [compare with (4)]

$$\begin{aligned}
iD^P(q) &= \frac{i}{q^2 - M^2 + i\varepsilon} + \frac{i}{q^2 - M^2 + i\varepsilon}\,i\Pi(q^2)\,\frac{i}{q^2 - M^2 + i\varepsilon} + \cdots \\
&= \frac{i}{q^2 - M^2 + \Pi(q^2) + i\varepsilon}
\end{aligned} \tag{5}$$

To order $g^2$ we have

$$i\Pi(q^2) = i^4 g^2\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2 - \mu^2 + i\varepsilon}\,\frac{1}{(q - k)^2 - m^2 + i\varepsilon} \tag{6}$$


As in the preceding chapter we need to regulate the integral, but we will leave that implicit. 

Having practiced with the spinful calculation of the preceding chapter, you can now 
whiz through this spinless calculation and obtain 


$$\Pi(z) = \frac{g^2}{16\pi^2} \int_0^1 d\alpha\, \log \frac{\Lambda^2}{\alpha m^2 + (1 - \alpha)\mu^2 - \alpha(1 - \alpha)z} \qquad (7)$$


with $\Lambda$ some cutoff. (Please do whiz and not imagine that you could whiz.) We use the
same Greek letter $\Pi$ and allow the two particles in the loop to have different masses, in
contrast to the situation in quantum electrodynamics.

As before, for $z$ real and negative, the argument of the log is real and positive, and $\Pi$ is
real. By the same token, for $z$ real and positive enough, the argument of the log becomes
negative for some value of $\alpha$, and $\Pi(z)$ goes complex. Indeed,


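$$\mathrm{Im}\,\Pi(\sigma + i\varepsilon) = -\frac{g^2}{16\pi^2} \int_0^1 d\alpha\, (-\pi)\, \theta[\alpha(1 - \alpha)\sigma - \alpha m^2 - (1 - \alpha)\mu^2] = \frac{g^2}{16\pi} \int_{\alpha_-}^{\alpha_+} d\alpha = \frac{g^2}{16\pi\sigma} \sqrt{(\sigma - (m + \mu)^2)(\sigma - (m - \mu)^2)} \qquad (8)$$

with $\alpha_\pm$ the two roots of the quadratic equation obtained by setting the argument of the
step function to zero.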
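
As a sanity check on the last step of (8), the following sketch compares the closed form with a direct estimate of the $\alpha$ interval on which the step function is nonzero. The function names and the sample masses are illustrative assumptions, not part of the text; the midpoint counting is simply a crude way of measuring $\alpha_+ - \alpha_-$.

```python
import math

def im_pi_closed(sigma, m, mu, g=1.0):
    """Closed form on the far right of (8); zero below the threshold (m + mu)^2."""
    if sigma <= (m + mu) ** 2:
        return 0.0
    root = math.sqrt((sigma - (m + mu) ** 2) * (sigma - (m - mu) ** 2))
    return g ** 2 / (16 * math.pi * sigma) * root

def im_pi_alpha(sigma, m, mu, g=1.0, n=200_000):
    """(g^2 / 16 pi) times the length of the alpha interval where the step function is 1."""
    hits = 0
    for j in range(n):
        a = (j + 0.5) / n  # midpoint of the j-th cell in [0, 1]
        if a * (1 - a) * sigma - a * m ** 2 - (1 - a) * mu ** 2 > 0:
            hits += 1
    return g ** 2 / (16 * math.pi) * hits / n

if __name__ == "__main__":
    # Above threshold the two numbers should agree; below threshold both vanish.
    print(im_pi_closed(9.0, 1.0, 0.5), im_pi_alpha(9.0, 1.0, 0.5))
    print(im_pi_closed(2.0, 1.0, 0.5), im_pi_alpha(2.0, 1.0, 0.5))
```

The agreement is limited only by the resolution $1/n$ of the grid in $\alpha$.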
Decay and disintegration

At this point you might already be flipping back to the expression given in chapter II.6 for
the decay rate of a particle. Earlier we entertained the suspicion that the imaginary part of
$\Pi(z)$ corresponds to decay. To confirm our suspicion, let us first go back to elementary
quantum mechanics. The higher energy levels in a hydrogen atom, say, are, strictly
speaking, not eigenstates of the Hamiltonian: an electron in a higher energy level will, in a
finite time, emit a photon and jump to a lower energy level. Phenomenologically, however,
the level could be assigned a complex energy $E - \frac{i}{2}\Gamma$. The probability of staying in this
level then goes with time like $|\psi(t)|^2 \propto |e^{-i(E - \frac{i}{2}\Gamma)t}|^2 = e^{-\Gamma t}$. (Note that in elementary
quantum mechanics, the Coulomb and radiation components of the electromagnetic field
are treated separately: the former is included in the Schrödinger equation but not the latter.
One of the aims of quantum field theory is to remedy this artificial split.)

We now go back to (5) and field theory: note that $\Pi(q^2)$ effectively shifts $M^2 \to M^2 - \Pi(q^2)$.
Recall from (III.3.3) that we have counter terms available to, well, counter two
cutoff-dependent pieces of $\Pi(q^2)$. But we have nothing to counter the imaginary part of
$\Pi(q^2)$ with, and so it better be cutoff independent. Indeed it is! The cutoff only appears in
the real part in (7).

We conclude that the effect of $\Pi$ going imaginary is to shift the mass of the $\varphi$ meson
by a cutoff-independent amount from $M$ to $\sqrt{M^2 - i\,\mathrm{Im}\,\Pi(M^2)} \approx M - i\,\mathrm{Im}\,\Pi(M^2)/(2M)$.
Note that to order $g^2$ it suffices to evaluate $\Pi$ at the unshifted mass squared $M^2$, since
the shift in mass is itself of order $g^2$. Thus $\Gamma = \mathrm{Im}\,\Pi(M^2)/M$ gives the decay rate, as we
suspected. We obtain ($g$ has dimension of mass and so the dimension is correct)

$$\Gamma = \frac{g^2}{16\pi M^3} \sqrt{[M^2 - (m + \mu)^2][M^2 - (m - \mu)^2]} \qquad (9)$$

precisely what we had in (II.6.7). You and I could both take a bow for getting all the factors 
exactly right! 
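
The chain of reasoning from the complex mass to the decay rate can be checked numerically. The sketch below, with illustrative function names and parameter values that are assumptions and not part of the text, evaluates $\Gamma = \mathrm{Im}\,\Pi(M^2)/M$ from (8), confirms that the imaginary part of $\sqrt{M^2 - i\,\mathrm{Im}\,\Pi(M^2)}$ is $-\Gamma/2$ to the order we are working, and verifies the exponential survival law.

```python
import cmath
import math

def decay_rate(M, m, mu, g):
    """Gamma = Im Pi(M^2) / M, with Im Pi taken from (8) at sigma = M^2."""
    im_pi = g ** 2 / (16 * math.pi * M ** 2) * math.sqrt(
        (M ** 2 - (m + mu) ** 2) * (M ** 2 - (m - mu) ** 2))
    return im_pi / M

def shifted_mass(M, im_pi):
    """Complex mass sqrt(M^2 - i Im Pi(M^2)); the principal branch has positive real part."""
    return cmath.sqrt(M ** 2 - 1j * im_pi)

def survival_probability(E, t):
    """|exp(-i E t)|^2 for a complex energy E."""
    return abs(cmath.exp(-1j * E * t)) ** 2

if __name__ == "__main__":
    M, m, mu, g = 3.0, 1.0, 0.5, 0.3
    gamma = decay_rate(M, m, mu, g)
    E = shifted_mass(M, gamma * M)
    print(gamma, E, survival_probability(E, 10.0), math.exp(-gamma * 10.0))
```

For weak coupling the difference between the exact square root and the linearized shift $M - i\Gamma/2$ is of higher order in $g^2$, which is why the comparison works to many digits.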

Note that both the treatment given in elementary quantum mechanics and the one given here are in
the spirit of treating the decay as a small perturbation. As the width becomes large, at some
point it no longer makes good sense to talk of the field associated with the particle $\varphi$.


Taking the imaginary part directly 


We ought to be able to take the imaginary part of the Feynman integral in (6) directly, rather
than having to first calculate it as an integral over the Feynman parameter $\alpha$. I will now
show you how to do this using a trick. For clarity and convenience, change notation from
(6), label the momentum carried by the two internal lines in figure III.8.4a separately, and
restore the momentum conservation delta function, so that

$$i\Pi(q) = (ig)^2 i^2 \int \frac{d^4k_\eta}{(2\pi)^4} \frac{d^4k_\xi}{(2\pi)^4}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q) \left\{ \frac{1}{k_\eta^2 - m_\eta^2 + i\varepsilon}\, \frac{1}{k_\xi^2 - m_\xi^2 + i\varepsilon} \right\} \qquad (10)$$
Figure III.8.4


Write the propagator as $1/(k^2 - m^2 + i\varepsilon) = \mathcal{P}(1/(k^2 - m^2)) - i\pi\delta(k^2 - m^2)$ and, noting
an explicit overall factor of $i$, take the real part of the curly bracket above, thus obtaining

$$\mathrm{Im}\,\Pi(q) = -g^2 \int d\Phi\, (\mathcal{P}_\eta \mathcal{P}_\xi - \Delta_\eta \Delta_\xi) \qquad (11)$$

For the sake of compactness, we have introduced the notation

$$d\Phi = \frac{d^4k_\eta}{(2\pi)^4} \frac{d^4k_\xi}{(2\pi)^4}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q), \qquad \mathcal{P}_\eta = \mathcal{P}\frac{1}{k_\eta^2 - m_\eta^2}, \qquad \Delta_\eta = \pi\delta(k_\eta^2 - m_\eta^2)$$

and so on.

We welcome the product of two delta functions; they are what we want, restricting the
two particles $\eta$ and $\xi$ on shell. But yuck, what do we do with the product of the two principal
values? They don’t correspond to anything too physical that we know of.

To get rid of the two principal values, we use a trick. 1 First, we regress and recall that
we started out with Feynman diagrams as spacetime diagrams (for example, fig. I.7.6) of
the process under study. Here (fig. III.8.4b) a $\varphi$ excitation turns into an $\eta$ and a $\xi$ with
amplitude $ig$ at some spacetime point, which by translation invariance we could take to be
the origin; the $\eta$ and the $\xi$ excitations propagate to some point $x$ with amplitude $iD_\eta(x)$
and $iD_\xi(x)$, respectively, and then recombine into $\varphi$ with amplitude $ig$ (note: not $-ig$).
Fourier transforming this product of spacetime amplitudes gives

$$i\Pi(q) = (ig)^2 i^2 \int dx\, e^{-iqx} D_\eta(x) D_\xi(x) \qquad (12)$$

To see that this is indeed the same as (10), all you have to do is to plug in the expression
(I.3.22) for $D_\eta(x)$ and $D_\xi(x)$.

Incidentally, while many “professors of Feynman diagrams” think almost exclusively 
in momentum space, Feynman titled his 1949 paper “Space-Time Approach to Quantum 
Electrodynamics,” and on occasions it is useful to think of the spacetime roots of a given 
Feynman diagram. Now is one of those occasions. 


1 C. Itzykson and J.-B. Zuber, Quantum Field Theory, p. 367.

Next, go back to exercise I.3.3 and recall that the advanced propagator $D_{\rm adv}(x)$ and
retarded propagator $D_{\rm ret}(x)$ vanish for $x^0 < 0$ and $x^0 > 0$, respectively, and thus the product
$D_{\rm adv}(x) D_{\rm ret}(x)$ manifestly vanishes for all $x$. Also recall that the advanced and retarded
propagators $D_{\rm adv}(k)$ and $D_{\rm ret}(k)$ differ from the Feynman propagator $D(k)$ by simply, but
crucially, having their poles in different half-planes in the complex $k^0$ plane. Thus

$$0 = -ig^2 \int dx\, e^{-iqx} D_{\eta,{\rm adv}}(x) D_{\xi,{\rm ret}}(x) = -ig^2 \int \frac{d^4k_\eta}{(2\pi)^4} \frac{d^4k_\xi}{(2\pi)^4}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q)\, \frac{1}{k_\eta^2 - m_\eta^2 - i\sigma_\eta\varepsilon}\, \frac{1}{k_\xi^2 - m_\xi^2 + i\sigma_\xi\varepsilon} \qquad (13)$$

where we used the shorthand $\sigma_\eta = \mathrm{sgn}(k_\eta^0)$ and $\sigma_\xi = \mathrm{sgn}(k_\xi^0)$. (The sign function is defined
by $\mathrm{sgn}(x) = \pm 1$ according to whether $x > 0$ or $< 0$.) Taking the imaginary part of 0, we
obtain [compare (11)]


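$$0 = -g^2 \int d\Phi\, (\mathcal{P}_\eta \mathcal{P}_\xi + \sigma_\eta \sigma_\xi \Delta_\eta \Delta_\xi) \qquad (14)$$

Subtracting (14) from (11) to get rid of the rather unpleasant term $\mathcal{P}_\eta \mathcal{P}_\xi$, we find finally

$$\mathrm{Im}\,\Pi(q) = +g^2 \int d\Phi\, (1 + \sigma_\eta \sigma_\xi)(\Delta_\eta \Delta_\xi) = g^2 \pi^2 \int \frac{d^4k_\eta}{(2\pi)^4} \frac{d^4k_\xi}{(2\pi)^4}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q)\, \theta(k_\eta^0)\delta(k_\eta^2 - m_\eta^2)\, \theta(k_\xi^0)\delta(k_\xi^2 - m_\xi^2)(1 + \sigma_\eta \sigma_\xi) \qquad (15)$$

Thus $\Pi(q)$ develops an imaginary part only when the three delta functions can be satisfied
simultaneously.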

To see what these three conditions imply, we can, since $\Pi(q)$ is a function of $q^2$, go
to a frame in which $q = (Q, \vec 0)$ with $Q > 0$ with no loss of generality. Since $k_\eta^0 + k_\xi^0 = Q > 0$
and since $(1 + \sigma_\eta \sigma_\xi)$ vanishes unless $k_\eta^0$ and $k_\xi^0$ have the same sign, $k_\eta^0$ and $k_\xi^0$
must be both positive if $\mathrm{Im}\,\Pi(q)$ is to be nonzero, but that is already mandated by the
two step functions. Furthermore, we need to solve the conservation of energy condition
$Q = \sqrt{\vec k^2 + m_\eta^2} + \sqrt{\vec k^2 + m_\xi^2}$ for some 3-vector $\vec k$. This is possible only if $Q > m_\eta + m_\xi$, in
which case, using the identity (I.8.14)

$$\theta(k^0)\delta(k^2 - \mu^2) = \theta(k^0)\, \frac{\delta(k^0 - \varepsilon_k)}{2\varepsilon_k} \qquad (16)$$

we obtain

$$\mathrm{Im}\,\Pi(q) = \frac{1}{2} g^2 \int \frac{d^3k_\eta}{(2\pi)^3 2\omega_\eta} \frac{d^3k_\xi}{(2\pi)^3 2\omega_\xi}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q) \qquad (17)$$


We see that (and as we will see more generally) $\mathrm{Im}\,\Pi(q)$ works out to be a finite integral
over delta functions. Indeed, no counter term is needed.
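
To see that the phase space integral (17) reproduces the Feynman parameter result (8), one can evaluate (17) in the rest frame $q = (Q, \vec 0)$. The sketch below is a minimal check under that assumption: it solves the energy condition for $|\vec k|$ by bisection, includes the Jacobian from the energy delta function, and compares with the closed form in (8) at $\sigma = Q^2$. Function names and sample masses are illustrative assumptions.

```python
import math

def two_body_momentum(Q, m_eta, m_xi):
    """Solve Q = sqrt(k^2 + m_eta^2) + sqrt(k^2 + m_xi^2) for k >= 0 (None below threshold)."""
    def f(k):
        return math.hypot(k, m_eta) + math.hypot(k, m_xi) - Q
    if f(0.0) > 0:
        return None
    lo, hi = 0.0, Q  # f(lo) <= 0 < f(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def im_pi_phase_space(Q, m_eta, m_xi, g=1.0):
    """Rest-frame evaluation of (17): (g^2/2) (2 pi)^-2 * Int d^3k delta(Q - w_eta - w_xi) / (4 w_eta w_xi)."""
    k = two_body_momentum(Q, m_eta, m_xi)
    if k is None:
        return 0.0
    w_eta, w_xi = math.hypot(k, m_eta), math.hypot(k, m_xi)
    jacobian = k / w_eta + k / w_xi  # d(w_eta + w_xi)/dk from the energy delta function
    radial = 4 * math.pi * k ** 2 / (4 * w_eta * w_xi * jacobian)
    return 0.5 * g ** 2 * radial / (2 * math.pi) ** 2

def im_pi_feynman_parameter(sigma, m, mu, g=1.0):
    """Closed form on the far right of (8)."""
    if sigma <= (m + mu) ** 2:
        return 0.0
    return g ** 2 / (16 * math.pi * sigma) * math.sqrt(
        (sigma - (m + mu) ** 2) * (sigma - (m - mu) ** 2))

if __name__ == "__main__":
    print(im_pi_phase_space(3.0, 1.0, 0.5), im_pi_feynman_parameter(9.0, 1.0, 0.5))
```

Both routes give $g^2 |\vec k| / (8\pi Q)$, with $|\vec k|$ the momentum of either particle in the rest frame.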

Some readers might feel that this trick of invoking the advanced and retarded propagators
is perhaps a bit “too tricky.” For them, I will show a more brute force method in
appendix 1.
Unitarity and the Cutkosky cutting rule

The simple example we just went through in detail illustrates what is known as the
Cutkosky cutting rule, which states that to calculate the imaginary part of a Feynman
amplitude we first “cut” through a diagram (as indicated by the dotted line in figure III.8.4a).
For each internal line cut, replace the propagator $1/(k^2 - m^2 + i\varepsilon)$ by $\delta(k^2 - m^2)$; in other
words, put the virtual excitation propagating through the cut onto the mass shell. Thus,
in our example, we could jump from (10) to (15) directly. This validates our intuition that
Feynman amplitudes go imaginary when virtual particles can “get real.”

For a precise statement of the cutting rule, see below. (The Cutkosky cut is not to be
confused with the Cauchy cut in the complex plane, of course.)

The Cutkosky cutting rule in fact follows in all generality from unitarity. A basic postulate
of quantum mechanics is that the time evolution operator $e^{-iHT}$ is unitary and hence
preserves probability. Recall from chapter I.8 that it is convenient to split off from the
$S$ matrix $S_{fi} = \langle f| e^{-iHT} |i\rangle$ the piece corresponding to “nothing is happening”: $S = I + iT$.
Unitarity $S^\dagger S = I$ then implies $2\,\mathrm{Im}\,T = -i(T - T^\dagger) = T^\dagger T$. Sandwiching this between
initial and final states and inserting a complete set of intermediate states ($1 = \sum_n |n\rangle\langle n|$)
we have

$$2\,\mathrm{Im}\,T_{fi} = \sum_n T^\dagger_{fn} T_{ni} \qquad (18)$$

which some readers might recognize as a generalization of the optical theorem from
elementary quantum mechanics.
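
The operator identity behind (18) can be checked on a toy example. The sketch below builds a $2 \times 2$ unitary matrix from a Hermitian matrix $K$ via the Cayley transform $S = (I + iK)(I - iK)^{-1}$ (one convenient way to generate a unitary matrix, chosen here for illustration and not taken from the text), defines $T$ through $S = I + iT$, and compares $-i(T - T^\dagger)$ with $T^\dagger T$.

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def s_matrix(K):
    """Cayley transform S = (I + iK)(I - iK)^-1, unitary when K is Hermitian."""
    eye = [[1.0, 0.0], [0.0, 1.0]]
    plus = [[eye[i][j] + 1j * K[i][j] for j in range(2)] for i in range(2)]
    minus = [[eye[i][j] - 1j * K[i][j] for j in range(2)] for i in range(2)]
    return matmul(plus, inverse(minus))

def optical_mismatch(K):
    """Largest entry of -i(T - T^dagger) - T^dagger T, with T defined by S = I + iT."""
    S = s_matrix(K)
    T = [[-1j * (S[i][j] - (1.0 if i == j else 0.0)) for j in range(2)] for i in range(2)]
    Td = dagger(T)
    lhs = [[-1j * (T[i][j] - Td[i][j]) for j in range(2)] for i in range(2)]
    rhs = matmul(Td, T)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    K = [[0.3, 0.1 + 0.2j], [0.1 - 0.2j, -0.5]]  # Hermitian by construction
    print(optical_mismatch(K))  # should be at the level of rounding error
```

Note that the right-hand side is quadratic in $T$ while the left-hand side is linear, which is the nonlinearity exploited in the text.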

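It is convenient to introduce $\mathcal{F} = -i\mathcal{M}$. (We are merely taking out an explicit factor of
$i$ in $\mathcal{M}$: in our simple example, $\mathcal{M}$ corresponds to $i\Pi$, $\mathcal{F}$ to $\Pi$.) Then the relation (I.8.16)
between $T$ and $\mathcal{M}$ becomes $T_{fi} = (2\pi)^4 \delta^{(4)}(P_{fi}) \left(\prod_{fi} 1/\rho\right) \mathcal{F}(f \leftarrow i)$, where for the sake
of compactness we have introduced some obvious notations [thus $\left(\prod_{fi} 1/\rho\right)$ denotes the
product of the normalization factors $1/\rho$ (see chapter I.8), one for each of the particles in
the state $i$ and in the state $f$, and $P_{fi}$ the sum of the momenta in $f$ minus the sum of the
momenta in $i$].

With this notation, the left-hand side of the generalized optical theorem becomes
$2\,\mathrm{Im}\,T_{fi} = 2(2\pi)^4 \delta^{(4)}(P_{fi}) \left(\prod_{fi} 1/\rho\right) \mathrm{Im}\,\mathcal{F}(f \leftarrow i)$ and the right-hand side

$$\sum_n T^\dagger_{fn} T_{ni} = \sum_n (2\pi)^4 \delta^{(4)}(P_{fn})\, (2\pi)^4 \delta^{(4)}(P_{ni}) \left(\prod_{fn} \frac{1}{\rho}\right) \left(\prod_{ni} \frac{1}{\rho}\right) (\mathcal{F}(n \leftarrow f))^* \mathcal{F}(n \leftarrow i)$$

The product of two delta functions $\delta^{(4)}(P_{fn})\delta^{(4)}(P_{ni}) = \delta^{(4)}(P_{fi})\delta^{(4)}(P_{ni})$, and thus we
could cancel off $\delta^{(4)}(P_{fi})$. Also $\left(\prod_{fn} 1/\rho\right)\left(\prod_{ni} 1/\rho\right)/\left(\prod_{fi} 1/\rho\right) = \left(\prod_n 1/\rho^2\right)$, and we happily
recover the more familiar factor $\rho^2$ [namely $(2\pi)^3 2\omega$ for bosons]. Thus finally, the generalized
optical theorem tells us that

$$2\,\mathrm{Im}\,\mathcal{F}(f \leftarrow i) = \sum_n \left(\prod_n \frac{1}{\rho^2}\right) (2\pi)^4 \delta^{(4)}(P_{ni})\, (\mathcal{F}(n \leftarrow f))^* \mathcal{F}(n \leftarrow i) \qquad (19)$$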
namely, that the imaginary part of the Feynman amplitude $\mathcal{F}(f \leftarrow i)$ is given by a sum
of $(\mathcal{F}(n \leftarrow f))^* \mathcal{F}(n \leftarrow i)$ over intermediate states $|n\rangle$. The particles in the intermediate
state are of course physical and on shell. We are to sum over all possible states $|n\rangle$ allowed
by quantum numbers and by the kinematics.

According to Cutkosky, given a Feynman diagram, to obtain its imaginary part, we
simply cut through it in the several different ways allowed, corresponding to the different
possible intermediate states $|n\rangle$. Particles in the state $|n\rangle$ are manifestly real, not virtual.

Note that unitarity and hence the optical theorem are nonlinear in the transition amplitude.
This has proved to be enormously useful in actual computation. Suppose we are
perturbing in some coupling $g$ and we know $\mathcal{F}(n \leftarrow i)$ and $\mathcal{F}(n \leftarrow f)$ to order $g^N$. The optical
theorem gives us $\mathrm{Im}\,\mathcal{F}(f \leftarrow i)$ to order $g^{N+1}$, and we could then construct $\mathcal{F}(f \leftarrow i)$
to order $g^{N+1}$ using a dispersion relation.

The application of the Cutkosky rule to the vacuum polarization function discussed in
this chapter is particularly simple: there is only one possible way of cutting the Feynman
diagram for $\Pi$. Here the initial and final states $|i\rangle$ and $|f\rangle$ both consist of a single $\varphi$ meson,
while the intermediate state $|n\rangle$ consists of an $\eta$ and a $\xi$ meson.

Referring to the appendix to chapter II.6, we recall that $\sum_n$ corresponds to $\int d^3k_\eta\, d^3k_\xi$.
Thus the optical theorem as stated in (19) says that

$$\mathrm{Im}\,\Pi(q) = \frac{1}{2} g^2 \int \frac{d^3k_\eta}{(2\pi)^3 2\omega_\eta} \frac{d^3k_\xi}{(2\pi)^3 2\omega_\xi}\, (2\pi)^4 \delta^4(k_\eta + k_\xi - q) \qquad (20)$$


precisely what we obtained in (17). You and I could take another bow, since we even get 
the factor of 2 correctly (as we must!). 

We should also say, in concluding this chapter, “Vive Cauchy!” 


Appendix 1: Taking the imaginary part by brute force


For those readers who like brute force, we will extract the imaginary part of

$$\Pi = -ig^2 \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{k^2 - \mu^2 + i\varepsilon}\, \frac{1}{(q - k)^2 - m^2 + i\varepsilon} \qquad (21)$$

by a more straightforward method, as promised in the text. Since $\Pi$ depends only on $q^2$, we have the luxury
of setting $q = (M, \vec 0)$. We already know that in the complex $M^2$ plane, $\Pi$ has a cut on the axis starting at
$M^2 = (m + \mu)^2$. Let us verify this by brute force.

We could restrict ourselves to $M > 0$. Factorizing, we find that the denominator of the integrand is a product of
four factors, $k^0 - (\varepsilon_k - i\varepsilon)$, $k^0 + (\varepsilon_k - i\varepsilon)$, $k^0 - (M + E_k - i\varepsilon)$, and $k^0 - (M - E_k + i\varepsilon)$, and thus the integrand
has four poles in the complex $k^0$ plane. (Evidently, $\varepsilon_k = \sqrt{\vec k^2 + \mu^2}$, $E_k = \sqrt{\vec k^2 + m^2}$, and if you are perplexed over
the difference between $\varepsilon_k$ and $\varepsilon$, then you are hopelessly confused.) We now integrate over $k^0$, choosing to close
the contour in the lower half plane and going around picking up poles. Picking up the pole at $\varepsilon_k - i\varepsilon$, we obtain
$\Pi_1 = -g^2 \int (d^3k/(2\pi)^3)(1/(2\varepsilon_k(\varepsilon_k - M - E_k)(\varepsilon_k - M + E_k)))$. Picking up the pole at $M + E_k - i\varepsilon$, we obtain
$\Pi_2 = -g^2 \int (d^3k/(2\pi)^3)(1/((M + E_k - \varepsilon_k)(M + E_k + \varepsilon_k)(2E_k)))$. We now regard $\Pi = \Pi_1 + \Pi_2$ as a function of
$M$:

$$\Pi = -g^2 \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{(M + E_k - \varepsilon_k)} \left[ \frac{1}{2\varepsilon_k(M - \varepsilon_k - E_k + i\varepsilon)} + \frac{1}{2E_k(M + E_k + \varepsilon_k - i\varepsilon)} \right] \qquad (22)$$
In spite of appearances, there is no pole at $M \simeq \varepsilon_k - E_k$. (Since this pole would lead to a cut at $\mu - m$, there better
not be!) For $M > 0$ we only care about the pole at $M \simeq \varepsilon_k + E_k = \sqrt{\vec k^2 + \mu^2} + \sqrt{\vec k^2 + m^2}$. When we integrate over
$\vec k$, this pole gets smeared into a cut starting at $m + \mu$. So far so good.
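
The claim that the apparent pole at $M = \varepsilon_k - E_k$ is absent can be checked directly: the square bracket in (22) vanishes there, so the prefactor $1/(M + E_k - \varepsilon_k)$ meets a zero and the integrand stays finite. The sketch below drops the $i\varepsilon$'s and works at one fixed $|\vec k|$; the function names and sample values are illustrative assumptions.

```python
import math

def bracket(M, eps_k, E_k):
    """Square bracket of (22) with the i-epsilons dropped."""
    return 1.0 / (2 * eps_k * (M - eps_k - E_k)) + 1.0 / (2 * E_k * (M + E_k + eps_k))

def integrand(M, eps_k, E_k):
    """Integrand of (22) at fixed |k|, up to the overall factor -g^2 / (2 pi)^3."""
    return bracket(M, eps_k, E_k) / (M + E_k - eps_k)

def finite_limit(eps_k, E_k):
    """Value of the integrand as M -> eps_k - E_k, from the derivative of the bracket."""
    return -(eps_k + E_k) / (8 * eps_k ** 2 * E_k ** 2)

if __name__ == "__main__":
    k, mu, m = 0.7, 1.0, 0.5
    eps_k, E_k = math.hypot(k, mu), math.hypot(k, m)
    M0 = eps_k - E_k
    print(bracket(M0, eps_k, E_k))            # vanishes: no pole
    print(integrand(M0 + 1e-4, eps_k, E_k))   # finite near M0
    print(finite_limit(eps_k, E_k))
```

The genuine pole at $M = \varepsilon_k + E_k$ sits in the first term of the bracket and is not cancelled.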

To calculate the discontinuity across the cut, we use the identity (16) once again. Restoring the $i\varepsilon$’s and throwing
away the term we don't care about, we have effectively

$$\Pi = -g^2 \int \frac{d^3k}{(2\pi)^3}\, \frac{1}{2\varepsilon_k(\varepsilon_k - M - E_k + i\varepsilon)(\varepsilon_k - M + E_k - i\varepsilon)} = -2\pi g^2 \int \frac{d^4k}{(2\pi)^4}\, \theta(k^0)\delta(k^2 - \mu^2)\, \frac{1}{(M - \varepsilon_k + E_k - i\varepsilon)(M - (\varepsilon_k + E_k) + i\varepsilon)}$$

The discontinuity of $\Pi$ across the cut just specified is determined by applying (I.2.13) to the factor
$1/(M - (\varepsilon_k + E_k) + i\varepsilon)$, giving $\mathrm{Im}\,\Pi = 2\pi^2 g^2 \int (d^4k/(2\pi)^4)\theta(k^0)\delta(k^2 - \mu^2)\delta(M - (\varepsilon_k + E_k))/2E_k$. Use the
identity (16) again in the form

$$\theta(q^0 - k^0)\delta((q - k)^2 - m^2) = \theta(q^0 - k^0)\, \frac{\delta((q^0 - k^0) - E_k)}{2E_k}$$

and we obtain

$$\mathrm{Im}\,\Pi = 2\pi^2 g^2 \int \frac{d^4k}{(2\pi)^4}\, \theta(k^0)\delta(k^2 - \mu^2)\, \theta(q^0 - k^0)\delta((q - k)^2 - m^2) \qquad (24)$$

Remarkably, as Cutkosky taught us, to obtain the imaginary part we simply replace the propagators in (21) by
delta functions.


Appendix 2: A dispersion representation for the two-point amplitude 


I would like to give you a bit more flavor of the dispersion program once active and now being revived (see
part N). Consider the two-point amplitude $i\mathcal{D}(x) = \langle 0| T(\mathcal{O}(x)\mathcal{O}(0)) |0\rangle$, with $\mathcal{O}(x)$ some operator in the canonical
formalism. For example, for $\mathcal{O}(x)$ equal to the field $\varphi(x)$, $\mathcal{D}(x)$ would be the propagator. In chapter I.8, we were
able to evaluate $\mathcal{D}(x)$ for a free field theory, because then we could solve the field equation of motion and expand
$\varphi(x)$ in terms of creation and annihilation operators. But what can we do in a fully interacting field theory? There
is no hope of solving the operator field equation of motion.

The goal of the dispersion program of the 1950s and 1960s was to say as much as possible about $\mathcal{D}(x)$ based on
general considerations such as analyticity.

OK, so first write $i\mathcal{D}(x) = \theta(x^0)\langle 0| e^{iP\cdot x}\mathcal{O}(0)e^{-iP\cdot x}\mathcal{O}(0) |0\rangle + \theta(-x^0)\langle 0| \mathcal{O}(0)e^{iP\cdot x}\mathcal{O}(0)e^{-iP\cdot x} |0\rangle$, where we
used spacetime translation $\mathcal{O}(x) = e^{iP\cdot x}\mathcal{O}(0)e^{-iP\cdot x}$. By the way, if you are not totally sure of this relation,
differentiate it to obtain $\partial_\mu \mathcal{O}(x) = i[P_\mu, \mathcal{O}(x)]$, which you should recognize as the relativistic version of the usual
Heisenberg equations (I.8.2, 3). Now insert $1 = \sum_n |n\rangle\langle n|$, with $|n\rangle$ a complete set of intermediate states, to obtain
$\langle 0| e^{iP\cdot x}\mathcal{O}(0)e^{-iP\cdot x}\mathcal{O}(0) |0\rangle = \langle 0| \mathcal{O}(0)e^{-iP\cdot x}\mathcal{O}(0) |0\rangle = \sum_n \langle 0| \mathcal{O}(0) |n\rangle\langle n| e^{-iP\cdot x}\mathcal{O}(0) |0\rangle = \sum_n e^{-iP_n\cdot x} |\mathcal{O}_{0n}|^2$, where
we used $P^\mu |0\rangle = 0$ and $P^\mu |n\rangle = P_n^\mu |n\rangle$ and defined $\mathcal{O}_{0n} = \langle 0| \mathcal{O}(0) |n\rangle$. Next, use the integral representations
for the step function $\theta(t) = -i \int (d\omega/2\pi)\, e^{i\omega t}/(\omega - i\varepsilon)$ and $\theta(-t) = i \int (d\omega/2\pi)\, e^{i\omega t}/(\omega + i\varepsilon)$. Again, if you are
not sure of this, simply differentiate $\frac{d}{dt}\theta(t) = -i \frac{d}{dt} \int (d\omega/2\pi)\, e^{i\omega t}/(\omega - i\varepsilon) = \int (d\omega/2\pi)\, e^{i\omega t}$, which you recognize
from (I.2.12) as indeed the integral representation of the delta function $\delta(t) = \frac{d}{dt}\theta(t)$. In other words, the
representation used here is the integral of the representation in (I.2.12).

Putting it all together, we obtain 



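$$\int d^4x\, e^{iq\cdot x}\, i\mathcal{D}(x) = -i(2\pi)^3 \sum_n |\mathcal{O}_{0n}|^2 \left[ \frac{\delta^{(3)}(\vec q - \vec P_n)}{P_n^0 - q^0 - i\varepsilon} + \frac{\delta^{(3)}(\vec q + \vec P_n)}{P_n^0 + q^0 - i\varepsilon} \right] \qquad (25)$$

The integral over $d^3x$ produced the 3-dimensional delta function, while the integral over $dx^0 = dt$ picked up the
denominator in the integral representation for the step function.

Now take the imaginary part using $\mathrm{Im}\, 1/(P_n^0 - q^0 - i\varepsilon) = \pi\delta(q^0 - P_n^0)$. We thus obtain

$$\mathrm{Im}\left( i \int d^4x\, e^{iqx} \langle 0| T(\mathcal{O}(x)\mathcal{O}(0)) |0\rangle \right) = \pi(2\pi)^3 \sum_n |\mathcal{O}_{0n}|^2 \left( \delta^{(4)}(q - P_n) + \delta^{(4)}(q + P_n) \right) \qquad (26)$$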
with the more pleasing 4-dimensional delta function. Note that for $q^0 > 0$ the term involving $\delta^{(4)}(q + P_n)$ drops
out, since the energies of physical states must be positive.

What have we accomplished? Even though we are totally incapable of calculating $\mathcal{D}(q)$, we have managed to
represent its imaginary part in terms of physical quantities that are measurable in principle, namely the absolute
square $|\mathcal{O}_{0n}|^2$ of the matrix element of $\mathcal{O}(0)$ between the vacuum state and the state $|n\rangle$. For example, if $\mathcal{O}(x)$ is
the meson field $\varphi(x)$ in a $\varphi^4$ theory, the state $|n\rangle$ would consist of the single-meson state, the three-meson state,
and so on. The general hope during the dispersion era was that by keeping a few states we could obtain a decent
approximation to $\mathcal{D}(q)$. Note that the result does not depend on perturbing in some coupling constant.

The contribution of the single meson state $|\vec k\rangle$ has a particularly simple form, as you might expect.
With our normalization of single-particle states (as in chapter I.8), Lorentz invariance implies $\langle \vec k| \mathcal{O}(0) |0\rangle = Z^{1/2}/\sqrt{(2\pi)^3 2\omega_k}$,
with $\omega_k = \sqrt{\vec k^2 + m^2}$ and $Z^{1/2}$ an unknown constant, measuring the “strength” with which $\mathcal{O}$ is
capable of producing the single meson from the vacuum. Putting this into (25) and recognizing that the sum
over single-meson states is now given by $\int d^3k\, |\vec k\rangle\langle \vec k|$ [with the normalization $\langle \vec k'| \vec k\rangle = \delta^{(3)}(\vec k' - \vec k)$], we find that
the single-meson contribution to $i\mathcal{D}(q)$ is given by


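$$-i(2\pi)^3 \int d^3k\, \frac{Z}{(2\pi)^3 2\omega_k} \left[ \frac{\delta^{(3)}(\vec q - \vec k)}{\omega_k - q^0 - i\varepsilon} + (q \to -q) \right] = -i\frac{Z}{2\omega_q} \left( \frac{1}{\omega_q - q^0 - i\varepsilon} + (q \to -q) \right) = \frac{iZ}{q^2 - m^2 + i\varepsilon} \qquad (27)$$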


This is a very satisfying result: even though we cannot calculate $i\mathcal{D}(q)$, we know that it has a pole at a position
determined by the meson mass with a residue that depends on how $\mathcal{O}$ is normalized.
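
The last equality in (27) is a one-line piece of algebra, and a quick numerical check guards against sign slips. The sketch below evaluates both sides away from the pole with the $i\varepsilon$'s dropped; the function names and sample kinematics are illustrative assumptions.

```python
import math

def two_term_form(q0, q_abs, m, Z):
    """-i Z / (2 w_q) * [1/(w_q - q0) + 1/(w_q + q0)], the middle expression of (27)."""
    w = math.hypot(q_abs, m)  # w_q = sqrt(|q|^2 + m^2)
    return -1j * Z / (2 * w) * (1.0 / (w - q0) + 1.0 / (w + q0))

def pole_form(q0, q_abs, m, Z):
    """i Z / (q^2 - m^2) with q^2 = q0^2 - |q|^2, the right-hand side of (27)."""
    return 1j * Z / (q0 ** 2 - q_abs ** 2 - m ** 2)

if __name__ == "__main__":
    print(two_term_form(0.4, 1.1, 0.9, 2.0), pole_form(0.4, 1.1, 0.9, 2.0))
```

The two terms are the positive- and negative-energy pieces; only their sum is Lorentz invariant.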

As a check, we can also easily calculate the contribution of the single-meson state to $-\mathrm{Im}\,\mathcal{D}(q)$. Plugging
into (26), we find, for $q^0 > 0$, $\pi Z \int (d^3k/2\omega_k)\, \delta^{(4)}(q - k) = (\pi Z/2\omega_q)\, \delta(q^0 - \omega_q) = \pi Z \delta(q^2 - m^2)$, where we used
(I.8.14, 16) in the last step.

Given our experience with the vacuum polarization function, we would expect $\mathcal{D}(q)$ (which by Lorentz
invariance is a function of $q^2$) to have a cut starting at $q^2 = (3m)^2$. To verify this, simply look at (26) and choose
$\vec q = 0$. The contribution of the three-meson state occurs at $\sqrt{q^2} = q^0 = P_n^0 = \sqrt{\vec k_1^2 + m^2} + \sqrt{\vec k_2^2 + m^2} + \sqrt{\vec k_3^2 + m^2} \geq 3m$.
The sum over states is now a triple integration over $\vec k_1$, $\vec k_2$, and $\vec k_3$, subject to the constraint $\vec k_1 + \vec k_2 + \vec k_3 = 0$.
Knowing the imaginary part of $\mathcal{D}(q)$ we can now write a dispersion relation of the kind in (3).

Finally, if you stare at (26) long enough (see exercise III.8.3) you will discover the relation

$$\mathrm{Im}\left( i \int d^4x\, e^{iqx} \langle 0| T(\mathcal{O}(x)\mathcal{O}(0)) |0\rangle \right) = \frac{1}{2} \int d^4x\, e^{iqx} \langle 0| [\mathcal{O}(x), \mathcal{O}(0)] |0\rangle \qquad (28)$$


The discussion here is relevant to the discussion of field redefinition in chapter I.8. Suppose our friend uses
$\eta = Z^{1/2}\varphi + \alpha\varphi^3$ instead of $\varphi$; then the present discussion shows that his propagator $\int d^4x\, e^{iqx} \langle 0| T(\eta(x)\eta(0)) |0\rangle$
still has a pole at $q^2 = m^2$. The important point is that physics fixes the pole to be at the same location.

Here we have taken $\mathcal{O}$ to be a Lorentz scalar. In applications (see chapter VII.3) the role of $\mathcal{O}$ is often played
by the electromagnetic current $J^\mu(x)$ (treated as an operator). The same discussion holds except that we have
to keep track of some Lorentz indices. Indeed, we recognize that the vacuum polarization function $\Pi_{\mu\nu}$ then
corresponds to the function $\mathcal{D}$ in this discussion.


Exercises 


III.8.1 Evaluate the imaginary part of the vacuum polarization function, and by explicit calculation verify that it
is related to the decay rate of a vector particle into an electron and a positron.

III.8.2 Suppose we add a term $g\varphi^3$ to our scalar $\varphi^4$ theory. Show that to order $g^4$ there is a “box diagram”
contributing to meson scattering $p_1 + p_2 \to p_3 + p_4$ with the amplitude

$$\mathcal{I} = g^4 \int \frac{d^4k}{(2\pi)^4}\, \frac{1}{(k^2 - m^2 - i\varepsilon)((k + p_2)^2 - m^2 - i\varepsilon)((k - p_1)^2 - m^2 - i\varepsilon)((k + p_2 - p_3)^2 - m^2 - i\varepsilon)}$$

Calculate the integral explicitly as a function of $s = (p_1 + p_2)^2$ and $t = (p_3 - p_2)^2$. Study the analyticity
property of $\mathcal{I}$ as a function of $s$ for fixed $t$. Evaluate the discontinuity of $\mathcal{I}$ across the cut and verify
Cutkosky’s cutting rule. Check that the optical theorem works. What about the analyticity property of $\mathcal{I}$
as a function of $t$ for fixed $s$? And as a function of $u = (p_3 - p_1)^2$?

III.8.3 Prove (28). [Hint: Do unto $\int d^4x\, e^{iqx} \langle 0| [\mathcal{O}(x), \mathcal{O}(0)] |0\rangle$ what we did to $\int d^4x\, e^{iqx} \langle 0| T(\mathcal{O}(x)\mathcal{O}(0)) |0\rangle$,
namely, insert $1 = \sum_n |n\rangle\langle n|$ (with $|n\rangle$ a complete set of states) between $\mathcal{O}(x)$ and $\mathcal{O}(0)$ in the commutator.
Now we don’t have to bother with representing the step function.]
Symmetry Breaking


A symmetric world would be dull 

While we would like to believe that the fundamental laws of Nature are symmetric, 
a completely symmetric world would be rather dull, and as a matter of fact, the real 
world is not perfectly symmetric. More precisely, we want the Lagrangian, but not the 
world described by the Lagrangian, to be symmetric. Indeed, a central theme of modern 
physics is the study of how symmetries of the Lagrangian can be broken. We will see in 
subsequent chapters that our present understanding of the fundamental laws is built upon 
an understanding of symmetry breaking. 

Consider the Lagrangian studied in chapter I.10:

$$\mathcal{L} = \frac{1}{2}\left[(\partial\vec\varphi)^2 - \mu^2\vec\varphi^2\right] - \frac{\lambda}{4}(\vec\varphi^2)^2 \qquad (1)$$

where $\vec\varphi = (\varphi_1, \varphi_2, \cdots, \varphi_N)$. This Lagrangian exhibits an $O(N)$ symmetry under which $\vec\varphi$
transforms as an $N$-component vector.

We can easily add terms that do not respect the symmetry. For instance, add terms
such as $\varphi_1^2$, $\varphi_1^4$ and $\varphi_1^2\vec\varphi^2$ and break the $O(N)$ symmetry down to $O(N - 1)$, under which
$\varphi_2, \cdots, \varphi_N$ rotate as an $(N - 1)$-component vector. This way of breaking the symmetry,
“by hand” as it were, is known as explicit breaking.

We can break the symmetry in stages. Obviously, if we want to, we can break it down to
$O(N - M)$ by hand, for any $M < N$.

Note that in this example, with the terms we added, the reflection symmetry $\varphi_a \to -\varphi_a$
(any $a$) still holds. It is easy enough to break this symmetry as well, by adding a term such
as $\varphi_a^3$, for example.

Breaking the symmetry by hand is not very interesting. Indeed, we might as well start 
with a nonsymmetric Lagrangian in the first place. 
Figure IV.1.1


Spontaneous symmetry breaking 

A more subtle and interesting way is to let the system “break the symmetry itself,” a
phenomenon known as spontaneous symmetry breaking. I will explain by way of an
example. Let us flip the sign of the $\vec\varphi^2$ term in (1) and write

$$\mathcal{L} = \frac{1}{2}\left[(\partial\vec\varphi)^2 + \mu^2\vec\varphi^2\right] - \frac{\lambda}{4}(\vec\varphi^2)^2 \qquad (2)$$

Naively, we would conclude that for small $\lambda$ the field $\varphi$ creates a particle of mass $\sqrt{-\mu^2} = i\mu$.
Something is obviously wrong.

The essential physics is analogous to what would happen if we give the spring constant
in an anharmonic oscillator the wrong sign and write $L = \frac{1}{2}(\dot q^2 + kq^2) - (\lambda/4)q^4$. We all
know what to do in classical mechanics. The potential energy $V(q) = -\frac{1}{2}kq^2 + (\lambda/4)q^4$
[known as the double-well potential (figure IV.1.1)] has two minima at $q = \pm v$, where
$v = (k/\lambda)^{1/2}$. At low energies, we choose either one of the two minima and study small
oscillations around that minimum. Committing to one or the other of the two minima
breaks the reflection symmetry $q \to -q$ of the system.
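
A quick numerical look at the double-well potential confirms the location of the minima and the height of the barrier $V(0) - V(\pm v) = k^2/(4\lambda)$. The sketch below uses a golden-section search on the half line $q \geq 0$, where the potential has a single minimum; the search routine and the sample values of $k$ and $\lambda$ are illustrative choices, not part of the text.

```python
import math

def V(q, k, lam):
    """Double-well potential V(q) = -k q^2 / 2 + lam q^4 / 4."""
    return -0.5 * k * q ** 2 + 0.25 * lam * q ** 4

def golden_min(f, lo, hi, iters=100):
    """Golden-section search for the minimum of a function with one minimum on [lo, hi]."""
    r = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        a = hi - r * (hi - lo)
        b = lo + r * (hi - lo)
        if f(a) < f(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    k, lam = 2.0, 0.5
    v_numeric = golden_min(lambda q: V(q, k, lam), 0.0, 6.0)
    print(v_numeric, math.sqrt(k / lam))                       # location of the minimum
    print(V(0.0, k, lam) - V(v_numeric, k, lam), k ** 2 / (4 * lam))  # barrier height
    print(V(v_numeric, k, lam) - V(-v_numeric, k, lam))        # reflection symmetry
```

The second minimum at $q = -v$ follows from $V(q) = V(-q)$, which the last line makes explicit.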

In quantum mechanics, however, the particle can tunnel between the two minima, the
tunneling barrier being $V(0) - V(\pm v)$. The probability of being in one or the other of
the two minima must be equal, thus respecting the reflection symmetry $q \to -q$ of the
Hamiltonian. In particular, the ground state wave function $\psi(q) = \psi(-q)$ is even.

Let us try to extend the same reasoning to quantum field theory. For a generic scalar
field Lagrangian $\mathcal{L} = \frac{1}{2}(\partial_0\varphi)^2 - \frac{1}{2}(\partial_i\varphi)^2 - V(\varphi)$ we again have to find the minimum of the
potential energy $\int d^Dx\, [\frac{1}{2}(\partial_i\varphi)^2 + V(\varphi)]$, where $D$ is the dimension of space. Clearly, any
spatial variation in $\varphi$ only increases the energy, and so we set $\varphi(x)$ to equal a spacetime
independent quantity $\varphi$ and look for the minimum of $V(\varphi)$. In particular, for the example
in (2), we have

$$V(\varphi) = -\frac{1}{2}\mu^2\vec\varphi^2 + \frac{\lambda}{4}(\vec\varphi^2)^2 \qquad (3)$$

As we will see, the $N = 1$ case is dramatically different from the $N \geq 2$ cases.
Difference between quantum mechanics and quantum field theory


Study the $N = 1$ case first. The potential $V(\varphi)$ looks exactly the same as the potential in
figure IV.1.1 with the horizontal axis relabeled as $\varphi$. There are two minima at $\varphi = \pm v = \pm(\mu^2/\lambda)^{1/2}$.

But some thought reveals a crucial difference between quantum field theory and quantum
mechanics. The tunneling barrier is now $[V(0) - V(\pm v)] \int d^Dx$ (where $D$ denotes
the dimension of space) and hence infinite (or more precisely, extensive with the volume of
the system)! Tunneling is shut down, and the ground state wave function is concentrated
around either $+v$ or $-v$. We have to commit to one or the other of the two possibilities
for the ground state and build perturbation theory around it. It does not matter which
one we choose: The physics is equivalent. But by making a choice, we break the reflection
symmetry $\varphi \to -\varphi$ of the Lagrangian.

The reflection symmetry is broken spontaneously! We did not put symmetry breaking
terms into the Lagrangian by hand but yet the reflection symmetry is broken. 1

Let’s choose the ground state at $+v$ and write $\varphi = v + \varphi'$. Expanding in $\varphi'$ we find after
a bit of arithmetic that

$$\mathcal{L} = \frac{\mu^4}{4\lambda} + \frac{1}{2}(\partial\varphi')^2 - \mu^2\varphi'^2 - O(\varphi'^3) \qquad (4)$$

The physical particle created by the shifted field $\varphi'$ has mass $\sqrt{2}\mu$. The physical mass
squared has to come out positive since, after all, it is just $V''(\varphi)|_{\varphi = v}$, as you can see after
a moment’s thought.

Similarly, you would recognize that the first term in (4) is just $-V(\varphi)|_{\varphi = v}$. If we are only
interested in the scattering of the mesons associated with $\varphi'$ this term does not enter at all.
Indeed, we are always free to add an arbitrary constant to $\mathcal{L}$ to begin with. We had quite
arbitrarily set $V(\varphi = 0)$ equal to 0. The same situation appears in quantum mechanics: In
the discussion of the harmonic oscillator the zero point energy $\frac{1}{2}\hbar\omega$ is not observable; only
transitions between energy levels are physical. We will return to this point in chapter VIII.2.

Yet another way of looking at (2) is that quantum field theory amounts to doing the
Euclidean functional integral

$$Z = \int D\varphi\, e^{-\int d^dx \left\{ \frac{1}{2}[(\partial\varphi)^2 - \mu^2\varphi^2] + \frac{\lambda}{4}(\varphi^2)^2 \right\}}$$

and perturbation theory just corresponds to studying the small oscillations around a
minimum of the Euclidean action. Normally, with $\mu^2$ positive, we expand around the
minimum $\varphi = 0$. With $\mu^2$ negative, $\varphi = 0$ is a local maximum and not a minimum.

In quantum field theory what is called the ground state is also known as the vacuum,
since it is literally the state in which the field is “at rest,” with no particles present. Here we
have two physically equivalent vacua from which we are to choose one. The value assumed
by $\varphi$ in the ground state, either $v$ or $-v$ in our example, is known as the vacuum expectation
value of $\varphi$. The field $\varphi$ is said to have acquired a vacuum expectation value.


1 An insignificant technical aside for the nitpickers: Strictly speaking, in field theory the ground state wave
function should be called a wave functional, since $\Psi[\varphi(\vec x)]$ is a functional of the function $\varphi(\vec x)$.


Continuous symmetry 

Let us now turn to (2) with $N \geq 2$. The potential (3) is shown in figure IV.1.2 for $N = 2$. The
shape of the potential has been variously compared to the bottom of a punted wine bottle
or a Mexican hat. The potential is minimized at $\vec\varphi^2 = \mu^2/\lambda$. Something interesting is going
on: We have an infinite number of vacua characterized by the direction of $\vec\varphi$ in that vacuum.
Because of the $O(2)$ symmetry of the Lagrangian they are all physically equivalent. The
result had better not depend on our choice. So let us choose $\vec\varphi$ to point in the 1 direction,
that is, $\varphi_1 = v \equiv +\sqrt{\mu^2/\lambda}$ and $\varphi_2 = 0$.

Now consider fluctuations around this field configuration, in other words, write $\varphi_1 = v + \varphi_1'$
and $\varphi_2 = \varphi_2'$, plug into (2) for $N = 2$, and expand $\mathcal{L}$ out. I invite you to do the
arithmetic. You should find (after dropping the primes on the fields; why clutter the
notation, right?)

$$\mathcal{L} = \frac{\mu^4}{4\lambda} + \frac{1}{2}\left[(\partial\varphi_1)^2 + (\partial\varphi_2)^2\right] - \mu^2\varphi_1^2 + O(\varphi^3) \qquad (5)$$

The constant term is exactly as in (4), and just like the field $\varphi'$ in (4), the field $\varphi_1$ has mass
$\sqrt{2}\mu$. But now note the remarkable feature of (5): the absence of a $\varphi_2^2$ term. The field $\varphi_2$ is
massless!
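
The mass spectrum can be read off from the matrix of second derivatives of the potential (3) at the chosen vacuum $(\varphi_1, \varphi_2) = (v, 0)$: one eigenvalue $2\mu^2$ for the radial direction and one zero eigenvalue for the angular direction. The sketch below estimates that matrix by central finite differences; the step size and the sample values of $\mu$ and $\lambda$ are illustrative assumptions.

```python
import math

def V(p1, p2, mu, lam):
    """Potential (3) for N = 2: V = -mu^2 phi^2 / 2 + lam (phi^2)^2 / 4."""
    r2 = p1 * p1 + p2 * p2
    return -0.5 * mu ** 2 * r2 + 0.25 * lam * r2 ** 2

def hessian_at_vacuum(mu, lam, h=1e-4):
    """Central-difference second derivatives of V at (v, 0); returns (V_11, V_22, V_12)."""
    v = math.sqrt(mu ** 2 / lam)

    def f(a, b):
        return V(v + a, b, mu, lam)

    v11 = (f(h, 0) - 2 * f(0, 0) + f(-h, 0)) / h ** 2
    v22 = (f(0, h) - 2 * f(0, 0) + f(0, -h)) / h ** 2
    v12 = (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h ** 2)
    return v11, v22, v12

if __name__ == "__main__":
    mu, lam = 1.3, 0.7
    v11, v22, v12 = hessian_at_vacuum(mu, lam)
    print(v11, 2 * mu ** 2)  # radial mode: mass squared 2 mu^2
    print(v22, v12)          # angular mode: massless, and no mixing
```

Since the matrix is diagonal at this point, the two diagonal entries are the mass squareds of $\varphi_1$ and $\varphi_2$.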


Emergence of massless boson 

That $\varphi_2$ comes out massless is not an accident. I will now explain that the masslessness is
a general and exact phenomenon.

Referring back to figure IV.1.2 we can easily understand the particle spectrum. Excitation
in the $\varphi_1$ field corresponds to fluctuation in the radial direction, “climbing the wall” so to
speak, while excitation in the $\varphi_2$ field corresponds to fluctuation in the angular direction,
“rolling along the gutter” so to speak. It costs no energy for a marble to roll along the
minima of the potential energy, going from one minimum to another. Another way of
saying this is to picture a long wavelength excitation of the form $\varphi_2 = a\sin(\omega t - \vec k\cdot\vec x)$ with
$a$ small. In a region of length scale small compared to $|\vec k|^{-1}$, the field $\varphi_2$ is essentially
constant and thus the field $\vec\varphi$ is just rotated slightly away from the 1 direction, which by
the $O(2)$ symmetry is equivalent to the vacuum. It is only when we look at regions of
length scale large compared to $|\vec k|^{-1}$ that we realize that the excitation costs energy. Thus,
as $|\vec k| \to 0$, we expect the energy of the excitation to vanish.

We now understand the crucial difference between the $N = 1$ and the $N = 2$ cases: In
the former we have a reflection symmetry, which is discrete, while in the latter we have an
$O(2)$ symmetry, which is continuous.

We have worked out the $N = 2$ case in detail. You should now be able to generalize our
discussion to arbitrary $N \geq 2$ (see exercise IV.1.1).

Meanwhile, it is worth looking at $N = 2$ from another point of view. Many field theories
can be written in more than one form and it is important to know them under different
guises. Construct the complex field $\varphi = (1/\sqrt{2})(\varphi_1 + i\varphi_2)$; we have $\varphi^\dagger\varphi = \frac{1}{2}(\varphi_1^2 + \varphi_2^2)$ and
so can write (2) as

$$\mathcal{L} = \partial\varphi^\dagger\partial\varphi + \mu^2\varphi^\dagger\varphi - \lambda(\varphi^\dagger\varphi)^2 \qquad (6)$$

which is manifestly invariant under the $U(1)$ transformation $\varphi \to e^{i\alpha}\varphi$ (recall chapter I.10).
You may recognize that this amounts to saying that the groups $O(2)$ and $U(1)$ are locally
isomorphic. Just as we can write a vector in Cartesian or polar coordinates we are free
to parametrize the field by $\varphi(x) = \rho(x)e^{i\theta(x)}$ (as in chapter III.5) so that $\partial_\mu\varphi = (\partial_\mu\rho +
i\rho\,\partial_\mu\theta)e^{i\theta}$. We obtain $\mathcal{L} = \rho^2(\partial\theta)^2 + (\partial\rho)^2 + \mu^2\rho^2 - \lambda\rho^4$. Spontaneous symmetry breaking
means setting $\rho = v + \chi$ with $v = +\sqrt{\mu^2/2\lambda}$, whereupon

$$\mathcal{L} = v^2(\partial\theta)^2 + \left[(\partial\chi)^2 - 2\mu^2\chi^2 - 4\sqrt{\frac{\mu^2\lambda}{2}}\,\chi^3 - \lambda\chi^4\right] + \left(\sqrt{\frac{2\mu^2}{\lambda}}\,\chi + \chi^2\right)(\partial\theta)^2 \qquad (7)$$

We recognize the phase $\theta(x)$ as the massless field. We have arranged the terms in the
Lagrangian in three groups: the kinetic energy of the massless field $\theta$, the kinetic and
potential energy of the massive field $\chi$, and the interaction between $\theta$ and $\chi$. (The additive
constant in (5) has been dropped to minimize clutter.)
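For readers who want to see where the coefficients of the polar-coordinate Lagrangian come from, here is a sketch of the expansion. It assumes only $\mathcal{L} = \rho^2(\partial\theta)^2 + (\partial\rho)^2 + \mu^2\rho^2 - \lambda\rho^4$ and $\rho = v + \chi$; the symbol $U(\rho)$ for the potential part is introduced here purely for bookkeeping.

```latex
\begin{aligned}
U(\rho) &\equiv -\mu^2\rho^2 + \lambda\rho^4, \qquad
U'(v) = 0 \;\Rightarrow\; v^2 = \frac{\mu^2}{2\lambda},\\
U(v+\chi) &= U(v) + \tfrac{1}{2}U''(v)\,\chi^2 + \tfrac{1}{6}U'''(v)\,\chi^3 + \tfrac{1}{24}U''''(v)\,\chi^4\\
          &= -\frac{\mu^4}{4\lambda} + 2\mu^2\chi^2 + 4\lambda v\,\chi^3 + \lambda\chi^4,
\qquad 4\lambda v = 4\sqrt{\frac{\mu^2\lambda}{2}},\\
\rho^2(\partial\theta)^2 &= \left(v^2 + 2v\chi + \chi^2\right)(\partial\theta)^2,
\qquad 2v = \sqrt{\frac{2\mu^2}{\lambda}} .
\end{aligned}
```

Since $\mathcal{L}$ contains $-U$, the $\chi$ field has mass term $-2\mu^2\chi^2$, while no term of the form $\theta^2$ can ever appear because $\theta$ enters only through $\partial\theta$. The constant $-U(v) = \mu^4/4\lambda$ is the same additive constant found in the Cartesian expansion.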
Goldstone’s theorem

We will now prove Goldstone’s theorem, which states that whenever a continuous
symmetry is spontaneously broken, massless fields, known as Nambu²-Goldstone bosons,
emerge.

Recall that associated with every continuous symmetry is a conserved charge $Q$. That $Q$
generates a symmetry is stated as

$$[H, Q] = 0 \qquad (8)$$

Let the vacuum (or ground state in quantum mechanics) be denoted by $|0\rangle$. By adding
an appropriate constant to the Hamiltonian $H \to H + c$ we can always write $H|0\rangle = 0$.
Normally, the vacuum is invariant under the symmetry transformation, $e^{i\theta Q}|0\rangle = |0\rangle$, or
in other words $Q|0\rangle = 0$.

But suppose the symmetry is spontaneously broken, so that the vacuum is not invariant
under the symmetry transformation; in other words, $Q|0\rangle \neq 0$. Consider the state $Q|0\rangle$.
What is its energy? Well,

$$HQ|0\rangle = [H, Q]|0\rangle = 0 \qquad (9)$$

[The first equality follows from $H|0\rangle = 0$ and the second from (8).] Thus, we have found
another state $Q|0\rangle$ with the same energy as $|0\rangle$.

Note that the proof makes no reference to either relativity or fields. You can also see that 
it merely formalizes the picture of the marble rolling along the gutter. 

In quantum field theory, we have local currents, and so

$$Q = \int d^Dx\, J^0(\vec x, t)$$

where $D$ denotes the dimension of space and conservation of $Q$ says that the integral can
be evaluated at any time. Consider the state

$$|s\rangle = \int d^Dx\, e^{-i\vec k\cdot\vec x}\, J^0(\vec x, t)|0\rangle$$

which has³ spatial momentum $\vec k$. As $\vec k$ goes to zero it goes over to $Q|0\rangle$, which as we learned
in (9) has zero energy. Thus, as the momentum of the state $|s\rangle$ goes to zero, its energy goes
to zero. In a relativistic theory, this means precisely that $|s\rangle$ describes a massless particle.


2 Y. Nambu, quite deservedly, received the 2008 physics Nobel Prize for his profound contribution to our 
understanding of spontaneous symmetry breaking. 

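3 Acting on it with $P^i$ (exercise I.11.3) and using $P^i|0\rangle = 0$, we have

$$P^i|s\rangle = \int d^Dx\, e^{-i\vec k\cdot\vec x}\,[P^i, J^0(\vec x, t)]|0\rangle = -i\int d^Dx\, e^{-i\vec k\cdot\vec x}\,\partial^i J^0(\vec x, t)|0\rangle = k^i|s\rangle$$

upon integrating by parts.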
The proof makes clear that the theorem practically exudes generality: It applies to any
spontaneously broken continuous symmetry.


Counting Nambu-Goldstone bosons 

From our proof, we see that the number of Nambu-Goldstone bosons is clearly equal to
the number of conserved charges that do not leave the vacuum invariant, that is, do not
annihilate $|0\rangle$. For each such charge $Q^a$, we can construct a zero-energy state $Q^a|0\rangle$.

In our example, we have only one current $J_\mu = i(\varphi_1\partial_\mu\varphi_2 - \varphi_2\partial_\mu\varphi_1)$ and hence one
Nambu-Goldstone boson. In general, if the Lagrangian is left invariant by a symmetry
group $G$ with $n(G)$ generators, but the vacuum is left invariant by only a subgroup $H$ of $G$
with $n(H)$ generators, then there are $n(G) - n(H)$ Nambu-Goldstone bosons. If you want
to show off your mastery of mathematical jargon you can say that the Nambu-Goldstone
bosons live in the coset space $G/H$.
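The counting rule $n(G) - n(H)$ is easy to mechanize. The sketch below is only an illustration under stated assumptions: it takes the standard generator count $n(n-1)/2$ for $O(n)$, assumes the breaking pattern $O(N) \to O(N-1)$ that arises when one component of $\vec\varphi$ acquires a vacuum expectation value, and the function names are hypothetical.

```python
def dim_orthogonal(n):
    """Number of generators of O(n): n(n-1)/2 (zero for the discrete group O(1))."""
    return n * (n - 1) // 2

def goldstone_count(dim_g, dim_h):
    """Number of Nambu-Goldstone bosons: generators of G minus generators of H."""
    return dim_g - dim_h

def goldstone_count_o_n(n):
    """Assumed breaking pattern O(n) -> O(n-1), vector order parameter."""
    return goldstone_count(dim_orthogonal(n), dim_orthogonal(n - 1))

if __name__ == "__main__":
    for n in range(2, 7):
        print("O(%d) -> O(%d):" % (n, n - 1), goldstone_count_o_n(n))
```

For $N = 2$ this gives the single massless field found above, and in general it gives $N - 1$, the number of directions tangent to the sphere of vacua.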


Ferromagnet and spin wave 

The generality of the proof suggests that the usefulness of Goldstone’s theorem is not
restricted to particle physics. In fact, it originated in condensed matter physics, the classic
example there being the ferromagnet. The Hamiltonian, being composed of just the
interaction of nonrelativistic electrons with the ions in the solid, is of course invariant
under the rotation group $SO(3)$, but the magnetization $\vec M$ picks out a direction, and
the ferromagnet is left invariant only under the subgroup $SO(2)$ consisting of rotations
about the axis defined by $\vec M$. The Nambu-Goldstone theorem is easy to visualize physically.
Consider a “spin wave” in which the local magnetization $\vec M(\vec x)$ varies slowly from point
to point. A physicist living in a region small compared to the wavelength does not even
realize that he or she is no longer in the “vacuum.” Thus, the frequency of the wave must
go to zero as the wavelength goes to infinity. This is of course exactly the same heuristic
argument given earlier. Note that quantum mechanics is needed only to translate the wave
vector $\vec k$ into momentum and the frequency $\omega$ into energy. I will come back to magnets
and spin waves in chapters V.3 and VI.5.


Quantum fluctuations and the dimension of spacetime 

Our discussion of spontaneous symmetry breaking is essentially classical. What happens
when quantum fluctuations are included? I will address this question in detail in
chapter IV.3, but for now let us go back to (5). In the ground state, $\varphi_1 = v$ and $\varphi_2 = 0$. Recall
that in the mattress model of a scalar field theory the mass term comes from the springs
holding the mattress to its equilibrium position. The term $-\mu^2\varphi_1'^2$ (note the prime) in (5)
tells us that it costs action for $\varphi_1'$ to wander away from its ground state value $\varphi_1' = 0$. But
now we are worried: $\varphi_2$ is massless. Can it wander away from its ground state value? To
answer this question let us calculate the mean square fluctuation


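$$\langle(\varphi_2(0))^2\rangle = \frac{1}{Z}\int D\varphi\, e^{iS(\varphi)}[\varphi_2(0)]^2
= \lim_{x\to 0}\frac{1}{Z}\int D\varphi\, e^{iS(\varphi)}\varphi_2(x)\varphi_2(0)
= \lim_{x\to 0}\int\frac{d^dk}{(2\pi)^d}\frac{e^{ikx}}{k^2} \qquad (10)$$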


(We recognized the functional integral that defines the propagator; recall chapter I.7.)
The upper limit of the integral in (10) is cut off at some $\Lambda$ (which would correspond to
the inverse of the lattice spacing when applying these ideas to a ferromagnet) and so as
explained in chapter III.1 (and as you will see in chapter VIII.3) we are not particularly
worried about the ultraviolet divergence for large $k$. But we do have to worry about a possible
infrared divergence for small $k$. (Note that for a massive field $1/k^2$ in (10) would have been
replaced by $1/(k^2 + \mu^2)$ and there would be no infrared divergence.)

We see that there is no infrared divergence for $d > 2$. Our picture of spontaneously
breaking a continuous symmetry is valid in our (3 + 1)-dimensional world.

However, for $d \leq 2$ the mean square fluctuation of $\varphi_2$ comes out infinite, so our naive
picture is totally off. We have arrived at the Coleman-Mermin-Wagner theorem (proved
independently by a particle theorist and two condensed matter theorists), which states
that spontaneous breaking of a continuous symmetry is impossible for $d = 2$. Note that
while our discussion is given for $O(2)$ symmetry the conclusion applies to any continuous
symmetry since the argument depends only on the presence of Nambu-Goldstone fields.
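The dependence on $d$ follows from simple power counting. As a rough sketch (treating the integral as Euclidean, dropping the angular factor, and introducing an infrared regulator $\epsilon$ that is purely an illustrative device), the small-$k$ behavior of the integral in (10) is

```latex
\int_{\epsilon<|k|<\Lambda}\frac{d^dk}{(2\pi)^d}\,\frac{1}{k^2}
\;\propto\; \int_\epsilon^\Lambda dk\,k^{d-3}
= \begin{cases}
\dfrac{\Lambda^{d-2}-\epsilon^{d-2}}{d-2}, & d\neq 2,\\[2mm]
\log(\Lambda/\epsilon), & d=2 .
\end{cases}
```

As $\epsilon \to 0$ this stays finite only for $d > 2$; it diverges logarithmically at $d = 2$ and as a power for $d < 2$.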

In our examples, symmetry is spontaneously broken by a scalar field $\varphi$, but nothing says
that the field $\varphi$ must be elementary. In many condensed matter systems, superconductors,
for example, symmetries are spontaneously broken, but we know that the system consists
of electrons and atomic nuclei. The field $\varphi$ is generated dynamically, for example as a bound
state of two electrons in superconductors. More on this in chapter V.4. The spontaneous
breaking of a symmetry by a dynamically generated field is sometimes referred to as
dynamical symmetry breaking.⁴


Exercises 

IV.1.1 Show explicitly that there are $N - 1$ Nambu-Goldstone bosons in the $G = O(N)$ example (2).

IV.1.2 Construct the analog of (2) with $N$ complex scalar fields and invariant under $SU(N)$. Count the number
of Nambu-Goldstone bosons when one of the scalar fields acquires a vacuum expectation value.


4 This chapter is dedicated to the memory of the late Jorge Swieca. 




The Pion as a Nambu-Goldstone Boson 


Crisis for field theory 

After the spectacular triumphs of quantum field theory in the electromagnetic interaction,
physicists in the 1950s and 1960s were naturally eager to apply it to the strong and weak
interactions. As we have already seen, field theory when applied to the weak interaction
appeared not to be renormalizable. As for the strong interaction, field theory appeared
totally untenable for other reasons. For one thing, as the number of experimentally observed
hadrons (namely strongly interacting particles) proliferated, it became clear that were we
to associate a field with each hadron the resulting field theory would be quite a mess, with
numerous arbitrary coupling constants. But even if we were to restrict ourselves to
nucleons and pions, the known coupling constant of the interaction between pions and nucleons
is a large number. (Hence the term strong interaction in the first place!) The perturbative
approach that worked so spectacularly well in quantum electrodynamics was doomed to
failure.

Many eminent physicists at the time advocated abandoning quantum field theory
altogether, and at certain graduate schools, quantum field theory was even dropped from
the curriculum. It was not until the early 1970s that quantum field theory made a
triumphant comeback. A field theory for the strong interaction was formulated, not in terms
of hadrons, but in terms of quarks and gluons. I will get to that in chapter VII.3.


Pion weak decay 

To understand the crisis facing field theory, let us go back in time and imagine what a field
theorist might be trying to do in the late 1950s. Since this is not a book on particle physics,
I will merely sketch the relevant facts. You are urged to consult one of the texts on the
subject.¹ By that time, many semileptonic decays such as $n \to p + e^- + \bar\nu$, $\pi^- \to e^- + \bar\nu$,
and $\pi^- \to \pi^0 + e^- + \bar\nu$ had been measured. Neutron $\beta$ decay $n \to p + e^- + \bar\nu$ was of
course the process for which Fermi invented his theory, which by that time had assumed
the form $\mathcal{L} = G[\bar e\gamma^\mu(1 - \gamma^5)\nu][\bar p\gamma_\mu(1 - \gamma^5)n]$, where $n$ is a neutron field annihilating a
neutron, $p$ a proton field annihilating a proton, $\nu$ a neutrino field annihilating a neutrino
(or creating an antineutrino as in $\beta$ decay), and $e$ an electron field annihilating an electron.

1 See, e.g., E. Commins and P. H. Bucksbaum, Weak Interactions of Leptons and Quarks.

It became clear that to write down a field for each hadron and a Lagrangian for each
decay process, as theorists were in fact doing for a while, was a losing battle. Instead, we
should write

$$\mathcal{L} = G[\bar e\gamma^\mu(1 - \gamma^5)\nu](J_\mu - J_{5\mu}) \qquad (1)$$

with $J_\mu$ and $J_{5\mu}$ two currents transforming as a Lorentz vector and axial vector, respectively.
We think of $J_\mu$ and $J_{5\mu}$ as quantum operators in a canonical formulation of field theory.
Our task would then be to calculate the matrix elements between hadron states,
$\langle p|(J_\mu - J_{5\mu})|n\rangle$, $\langle 0|(J_\mu - J_{5\mu})|\pi^-\rangle$, $\langle\pi^0|(J_\mu - J_{5\mu})|\pi^-\rangle$, and so on, corresponding to the three
decay processes I listed above. (I should make clear that although I am talking about weak
decays, the calculation of these matrix elements is a problem in the strong interaction. In
other words, in understanding these decays, we have to treat the strong interaction to all
orders in the strong coupling, but it suffices to treat the weak interaction to lowest order in
the weak coupling $G$.) Actually, there is a precedent for the attitude we are adopting here.
To account for nuclear $\beta$ decay $(Z, A) \to (Z + 1, A) + e^- + \bar\nu$, Fermi certainly did not
write a separate Lagrangian for each nucleus. Rather, it was the task of the nuclear theorist
to calculate the matrix element $\langle Z + 1, A|[\bar p\gamma_\mu(1 - \gamma^5)n]|Z, A\rangle$. Similarly, it is the task of
the strong interaction theorist to calculate matrix elements such as $\langle p|(J_\mu - J_{5\mu})|n\rangle$.

For the story I am telling, let me focus on trying to calculate the matrix element of
the axial vector current $\langle p|J_5^\mu|n\rangle$ between a neutron and a proton. Here we make a trivial
change in notation: We no longer indicate that we have a neutron in the initial state and a
proton in the final state, but instead we specify the momentum $p$ of the neutron and the
momentum $p'$ of the proton. Incidentally, in (1) the fields and the currents are of course
all functions of the spacetime coordinates $x$. Thus, we want to calculate $\langle p'|J_5^\mu(x)|p\rangle$,
but by translation invariance this is equal to $\langle p'|J_5^\mu(0)|p\rangle e^{-i(p'-p)\cdot x}$. Henceforth, we
simply calculate $\langle p'|J_5^\mu(0)|p\rangle$ and suppress the 0. Note that spin labels have already been
suppressed.

Lorentz invariance and parity can take us some distance: They imply that²

$$\langle p'|J_5^\mu|p\rangle = \bar u(p')[\gamma^\mu\gamma^5 F(q^2) + q^\mu\gamma^5 G(q^2)]u(p) \qquad (2)$$

with $q \equiv p' - p$ [compare with (III.6.7)]. But Lorentz invariance and parity can only take
us so far: We know nothing about the “form factors” $F(q^2)$ and $G(q^2)$.


2 Another possible term of the form $(p' + p)^\mu\gamma^5$ can be shown to vanish by charge conjugation and isospin
symmetries.
Figure IV.2.1 (panels (a) and (b))

Similarly, for the matrix element $\langle 0|J_5^\mu|k\rangle$ Lorentz invariance tells us that

$$\langle 0|J_5^\mu|k\rangle = fk^\mu \qquad (3)$$

I have again labeled the initial state by the momentum $k$ of the pion. The right-hand side
of (3) has to be a vector but since $k$ is the only vector available it has to be proportional
to $k$. Just like $F(q^2)$ and $G(q^2)$, the constant $f$ is a strong interaction quantity that we
don’t know how to calculate. On the other hand, $F(q^2)$, $G(q^2)$, and $f$ can all be measured
experimentally. For instance, the rate for the decay $\pi^- \to e^- + \bar\nu$ clearly depends on $f^2$.


Too many diagrams 

Let us look over the shoulder of a field theorist trying to calculate $\langle p'|J_5^\mu|p\rangle$ and $\langle 0|J_5^\mu|k\rangle$
in (2) and (3) in the late 1950s. He would draw Feynman diagrams such as the ones in figures IV.2.1
and IV.2.2 and soon realize that it would be hopeless. Because of the strong coupling, he
would have to calculate an infinite number of diagrams, even if the strong interaction were
described by a field theory, a notion already rejected by many luminaries of the time.



Figure IV.2.2 
In telling the story of the breakthrough I am not going to follow the absolutely fascinating
history of the subject, full of total confusion and blind alleys. Instead, with the benefit of
hindsight, I am going to tell the story using what I regard as the best pedagogical approach.


The pion is very light 

The breakthrough originated in the observation that the mass of the $\pi^-$ at 139 MeV was
considerably less than the mass of the proton at 938 MeV. For a long time this was simply
taken as a fact not in any particular need of an explanation. But eventually some theorists
wondered why one hadron should be so much lighter than another.

Finally, some theorists took the bold step of imagining an “ideal world” in which the
$\pi^-$ is massless. The idea was that this ideal world would be a good approximation of our
world, to an accuracy of about 15% ($\sim 139/938$).

Do you remember one circumstance in which a massless spinless particle would emerge
naturally? Yes, spontaneous symmetry breaking! In one of the blinding insights that have
characterized the history of particle physics, some theorists proposed that the $\pi$ mesons
are the Nambu-Goldstone bosons of some spontaneously broken symmetry.

Indeed, let’s multiply (3) by $k_\mu$:

$$k_\mu\langle 0|J_5^\mu|k\rangle = fk^2 = fm_\pi^2 \qquad (4)$$

which is equal to zero in the ideal world. Recall from our earlier discussion on translation
invariance that

$$\langle 0|J_5^\mu(x)|k\rangle = \langle 0|J_5^\mu(0)|k\rangle e^{-ik\cdot x}$$

and hence

$$\langle 0|\partial_\mu J_5^\mu(x)|k\rangle = -ik_\mu\langle 0|J_5^\mu(0)|k\rangle e^{-ik\cdot x}$$

Thus, if the axial current is conserved, $\partial_\mu J_5^\mu(x) = 0$, in the ideal world, $k_\mu\langle 0|J_5^\mu|k\rangle = 0$
and (4) would indeed imply $m_\pi^2 = 0$.

The ideal world we are discussing enjoys a symmetry known as the chiral symmetry
of the strong interaction. The symmetry is spontaneously broken in the ground state we
inhabit, with the $\pi$ meson as the Nambu-Goldstone boson. The Noether current associated
with this symmetry is the conserved $J_5^\mu$.

In fact, you should recognize that the manipulation here is closely related to the proof
of the Nambu-Goldstone theorem given in chapter IV.1.


Goldberger-Treiman relation 

Now comes the punchline. Multiply (2) by $(p' - p)_\mu$. By the same translation invariance
argument we just used,

$$(p' - p)_\mu\langle p'|J_5^\mu(0)|p\rangle = i\langle p'|\partial_\mu J_5^\mu(x)|p\rangle e^{i(p'-p)\cdot x}$$

and hence vanishes if $\partial_\mu J_5^\mu = 0$. On the other hand, multiplying the right-hand side of
(2) by $(p' - p)_\mu$ we obtain $\bar u(p')[(\not{p}' - \not{p})\gamma^5 F(q^2) + q^2\gamma^5 G(q^2)]u(p)$. Using the Dirac
equation (do it!) we conclude that

$$0 = 2m_N F(q^2) + q^2 G(q^2) \qquad (5)$$

with $m_N$ the nucleon mass.
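For readers who would like to check their arithmetic, the Dirac-equation step can be sketched as follows. It assumes the on-shell conditions for both spinors with a common nucleon mass $m_N$ and the anticommutation of $\gamma^5$ with $\gamma^\mu$; the notation $\gamma\cdot p \equiv \gamma^\mu p_\mu$ is used here in place of the slash.

```latex
\begin{aligned}
&\bar u(p')\,\gamma\!\cdot\! p' = m_N\,\bar u(p'), \qquad
\gamma\!\cdot\! p\; u(p) = m_N\, u(p), \qquad
\gamma^\mu\gamma^5 = -\gamma^5\gamma^\mu,\\[1mm]
&\bar u(p')\,(\gamma\!\cdot\! p' - \gamma\!\cdot\! p)\,\gamma^5\, u(p)
 = m_N\,\bar u(p')\gamma^5 u(p) + \bar u(p')\,\gamma^5\,\gamma\!\cdot\! p\; u(p)
 = 2m_N\,\bar u(p')\gamma^5 u(p),\\[1mm]
&\Rightarrow\quad
\left[\,2m_N F(q^2) + q^2 G(q^2)\,\right]\bar u(p')\gamma^5 u(p) = 0 .
\end{aligned}
```

Since $\bar u(p')\gamma^5 u(p)$ does not vanish in general, the bracket must.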

The form factors F(q 2 ) and G(q 2 ) are each determined by an infinite number of 
Feynman diagrams we have no hope of calculating, but yet we have managed to relate 
them! This represents a common strategy in many areas of physics: When faced with 
various quantities we don’t know how to calculate, we can nevertheless try to relate them. 

We can go farther by letting $q \to 0$ in (5). Referring to (2) we see that $F(0)$ is measured
experimentally in $n \to p + e^- + \bar\nu$ (the momentum transfer is negligible on the scale of
the strong interaction). But oops, we seem to have a problem: We predict the nucleon mass
$m_N = 0$!

In fact, we are saved by examining figure IV.2.1b: There are an infinite number of
diagrams exhibiting a pole due to none other than the massless $\pi$ meson, which you can
see gives

$$fq^\mu\,\frac{1}{q^2}\,g_{\pi NN}\,\bar u(p')\gamma^5 u(p) \qquad (6)$$

When the $\pi$ propagator joins onto the nucleon line, an infinite number of diagrams
summed together gives the experimentally measured pion-nucleon coupling constant
$g_{\pi NN}$. Thus, referring to (2), we see that for $q \sim 0$ the form factor $G(q^2) \sim f(1/q^2)g_{\pi NN}$.
Plugging into (5), we obtain the celebrated Goldberger-Treiman relation

$$2m_N F(0) + fg_{\pi NN} = 0 \qquad (7)$$

relating four experimentally measured quantities. As might be expected, it holds with about
a 15% error, consistent with our not living in a world with an exactly massless $\pi$ meson.


Toward a theory of the strong interaction 

The art of relating infinite sets of Feynman diagrams without calculating them, and it is an
art form involving a great deal of cleverness, was developed into a subject called dispersion
relations and S-matrix theory, which we mentioned briefly in chapter III.8. Our present
understanding of the strong interaction was built on this foundation. You could see from
this example that an important component of dispersion relations was the study of the
analyticity properties of Feynman diagrams as described in chapter III.8. The essence of
the Goldberger-Treiman argument is separating the infinite number of diagrams into those
with a pole in the complex $q^2$-plane and those without a pole (but with a cut).

The discovery that the strong interaction contains a spontaneously broken symmetry 
provided a crucial clue to the underlying theory of the strong interaction and ultimately 
led to the concepts of quarks and gluons. 
A note for the historian of science: Whether theoretical physicists regard a quantity as
small or large depends (obviously) on the cultural and mental framework they grew up
in. Treiman once told me that the notion of setting 138 MeV to zero, when the energy
released per nucleon in nuclear fission is of order 10 MeV, struck the generation that grew
up with the atomic bomb (as Treiman did; he was with the armed forces in the Pacific) as
surely the height of absurdity. Now of course a new generation of young string theorists
is perfectly comfortable in regarding anything less than the Planck energy $10^{19}$ GeV as
essentially zero.



Effective Potential 


IV.3 


Quantum fluctuations and symmetry breaking 

The important phenomenon of spontaneous symmetry breaking was based on minimizing
the classical potential energy $V(\varphi)$ of a quantum field theory. It is natural to wonder how
quantum fluctuations would change this picture.

To motivate the discussion, consider once again (III.3.3)

$$\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}\mu^2\varphi^2 - \frac{1}{4!}\lambda\varphi^4 + A(\partial\varphi)^2 + B\varphi^2 + C\varphi^4 \qquad (1)$$

(Speaking of quantum fluctuations, we have to include counterterms as indicated.) What
have you learned about this theory? For $\mu^2 > 0$, the action is extremized at $\varphi = 0$, and
quantizing the small fluctuations around $\varphi = 0$ we obtain scalar particles that scatter off
each other. For $\mu^2 < 0$, the action is extremized at some $\varphi_{\min}$, and the discrete symmetry
$\varphi \to -\varphi$ is spontaneously broken, as you learned in chapter IV.1. What happens when
$\mu = 0$? To break or not to break, that is the question.

A quick guess is that quantum fluctuations would break the symmetry. The $\mu = 0$ theory
is poised on the edge of symmetry breaking, and quantum fluctuations ought to push it
over the brink. Think of a classical pencil perfectly balanced on its tip. Then “switch on”
quantum mechanics.


Wisdom of the son-in-law 

Let us follow Schwinger and Jona-Lasinio and develop the formalism that enables us to
answer this question. Consider a scalar field theory defined by

$$Z = e^{iW(J)} = \int D\varphi\, e^{i[S(\varphi) + J\varphi]} \qquad (2)$$

[with the convenient shorthand $J\varphi = \int d^4x\, J(x)\varphi(x)$]. If we can do the functional integral,
we obtain the generating functional $W(J)$. As explained in chapter I.7, by differentiating
$W$ with respect to the source $J(x)$ repeatedly, we can obtain any Green’s function and
hence any scattering amplitude we want. In particular,

$$\varphi_c(x) \equiv \frac{\delta W}{\delta J(x)} = \frac{1}{Z}\int D\varphi\, e^{i[S(\varphi) + J\varphi]}\varphi(x) \qquad (3)$$

The subscript c is used traditionally to remind us (see appendix 2 in chapter I.8) that in a
canonical formalism $\varphi_c(x)$ is the expectation value $\langle 0|\hat\varphi|0\rangle$ of the quantum operator $\hat\varphi$. It
is certainly not to be confused with the integration dummy variable $\varphi$ in (3). The relation
(3) determines $\varphi_c(x)$ as a functional of $J$.

Given a functional $W$ of $J$ we can perform a Legendre transform to obtain a functional
$\Gamma$ of $\varphi_c$. Legendre transform is just the fancy term for the simple relation

$$\Gamma(\varphi_c) = W(J) - \int d^4x\, J(x)\varphi_c(x) \qquad (4)$$

The relation is simple, but be careful about what it says: It defines a functional of $\varphi_c(x)$
through the implicit dependence of $J$ on $\varphi_c$. On the right-hand side of (4) $J$ is to be
eliminated in favor of $\varphi_c$ by solving (3). We expand the functional $\Gamma(\varphi_c)$ in the form

$$\Gamma(\varphi_c) = \int d^4x\,[-V_{\rm eff}(\varphi_c) + Z(\varphi_c)(\partial\varphi_c)^2 + \cdots] \qquad (5)$$

where $(\cdots)$ indicates terms with higher and higher powers of $\partial$. We will soon see the
wisdom of the notation $V_{\rm eff}(\varphi_c)$.

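The point of the Legendre transform is that the functional derivative of $\Gamma$ is nice and
simple:

$$\frac{\delta\Gamma(\varphi_c)}{\delta\varphi_c(y)} = \int d^4x\,\frac{\delta J(x)}{\delta\varphi_c(y)}\frac{\delta W(J)}{\delta J(x)} - \int d^4x\,\frac{\delta J(x)}{\delta\varphi_c(y)}\varphi_c(x) - J(y) = -J(y) \qquad (6)$$

a relation we can think of as the “dual” of $\delta W(J)/\delta J(x) = \varphi_c(x)$.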

If you vaguely feel that you have seen this sort of manipulation before in your physics
education, you are quite right! It was in a course on thermodynamics, where you learned
about the Legendre transform relating the free energy to the energy: $F = E - TS$ with
$F$ a function of the temperature $T$ and $E$ a function of the entropy $S$. Thus $J$ and
$\varphi$ are “conjugate” pairs just like $T$ and $S$ (or even more clearly magnetic field $H$ and
magnetization $M$). Convince yourself that this is far more than a mere coincidence.
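To see the Legendre transform at work in the simplest possible setting, here is a toy sketch. It is an illustration under strong assumptions, not the calculation of the text: the source is constant, everything is per unit spacetime volume, quantum fluctuations are ignored (so that $W(J)$ is just the extremum of $-V(\varphi) + J\varphi$), and the potential $V = \frac{1}{2}m^2\varphi^2 + \frac{\lambda}{4!}\varphi^4$ with $m^2 = 1$, $\lambda = 0.6$ is an arbitrary illustrative choice. All function names are hypothetical.

```python
# Tree-level, constant-source toy model of the Legendre transform W(J) -> Gamma(phi_c).
# m2 and lam are illustrative values, not taken from the text.

def V(phi, m2=1.0, lam=0.6):
    """Classical potential V = (1/2) m2 phi^2 + (lam/4!) phi^4."""
    return 0.5 * m2 * phi**2 + lam * phi**4 / 24.0

def dV(phi, m2=1.0, lam=0.6):
    """First derivative V'(phi)."""
    return m2 * phi + lam * phi**3 / 6.0

def phi_of_J(J, m2=1.0, lam=0.6):
    """Solve V'(phi) = J by bisection (V' is monotonic for m2, lam > 0)."""
    lo, hi = -abs(J) / m2 - 1.0, abs(J) / m2 + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dV(mid, m2, lam) < J:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def W(J, m2=1.0, lam=0.6):
    """Tree-level generating function per unit volume: -V(phi_s) + J phi_s."""
    phi_s = phi_of_J(J, m2, lam)
    return -V(phi_s, m2, lam) + J * phi_s

def legendre_check(J, h=1e-5):
    """Return (phi_c, Gamma) with phi_c = dW/dJ and Gamma = W - J phi_c."""
    phi_c = (W(J + h) - W(J - h)) / (2.0 * h)  # numerical derivative of W
    gamma = W(J) - J * phi_c
    return phi_c, gamma

if __name__ == "__main__":
    J = 0.7
    phi_c, gamma = legendre_check(J)
    print("phi_c        =", phi_c)
    print("Gamma        =", gamma)
    print("-V(phi_c)    =", -V(phi_c))
    print("V'(phi_c)-J  =", dV(phi_c) - J)
```

In this toy setting the transform simply hands back $\Gamma = -V$, and the slope of $V$ at $\varphi_c$ equals the source $J$, which is the tree-level shadow of the relation between the effective potential and the source.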

For $J$ and $\varphi_c$ independent of $x$ we see from (5) that the condition (6) reduces to

$$V'_{\rm eff}(\varphi_c) = J \qquad (7)$$

This relation makes clear what the effective potential $V_{\rm eff}(\varphi_c)$ is good for. Let’s ask what
happens when there is no external source $J$. The answer is immediate: (7) tells us that

$$V'_{\rm eff}(\varphi_c) = 0 \qquad (8)$$

In other words, the vacuum expectation value of $\varphi$ in the absence of an external source is
determined by minimizing $V_{\rm eff}(\varphi_c)$.


First order in quantum fluctuations 


All of these formal manipulations are not worth much if we cannot evaluate $W(J)$. In
fact, in most cases we can only evaluate $e^{iW(J)} = \int D\varphi\, e^{i[S(\varphi) + J\varphi]}$ in the steepest descent
approximation (see chapter I.2). Let us turn the crank and find the steepest descent “point”
$\varphi_s(x)$, namely the solution of (henceforth I will drop the subscript c as there is little risk
of confusion)

$$\left.\frac{\delta[S(\varphi) + \int d^4y\, J(y)\varphi(y)]}{\delta\varphi(x)}\right|_{\varphi_s} = 0 \qquad (9)$$

or more explicitly,

$$\partial^2\varphi_s(x) + V'[\varphi_s(x)] = J(x) \qquad (10)$$


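Write the dummy integration variable in (2) as $\varphi = \varphi_s + \tilde\varphi$ and expand to quadratic order
in $\tilde\varphi$ to obtain

$$\begin{aligned}
Z = e^{(i/\hbar)W(J)} &= \int D\varphi\, e^{(i/\hbar)[S(\varphi) + J\varphi]}\\
&\simeq e^{(i/\hbar)[S(\varphi_s) + J\varphi_s]}\int D\tilde\varphi\, e^{(i/\hbar)\int d^4x\,\frac{1}{2}[(\partial\tilde\varphi)^2 - V''(\varphi_s)\tilde\varphi^2]}\\
&= e^{(i/\hbar)[S(\varphi_s) + J\varphi_s] - \frac{1}{2}{\rm tr}\log[\partial^2 + V''(\varphi_s)]}
\end{aligned} \qquad (11)$$

We have used (II.5.2) to represent the determinant we get upon integrating over $\tilde\varphi$. Note
that I have put back Planck’s constant $\hbar$. Here $\varphi_s$, as a solution of (10), is to be regarded as
a function of $J$.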

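Now that we have determined

$$W(J) = [S(\varphi_s) + J\varphi_s] + \frac{i\hbar}{2}{\rm tr}\log[\partial^2 + V''(\varphi_s)] + O(\hbar^2)$$

it is straightforward to Legendre transform. I will go painfully slowly here:

$$\varphi = \frac{\delta W}{\delta J} = \frac{\delta[S(\varphi_s) + J\varphi_s]}{\delta\varphi_s}\frac{\delta\varphi_s}{\delta J} + \varphi_s + O(\hbar) = \varphi_s + O(\hbar)$$

To leading order in $\hbar$, $\varphi$ (namely the object formerly known as $\varphi_c$) is equal to $\varphi_s$. Thus,
from (4) we obtain

$$\Gamma(\varphi) = S(\varphi) + \frac{i\hbar}{2}{\rm tr}\log[\partial^2 + V''(\varphi)] + O(\hbar^2) \qquad (12)$$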


Nice though this formula looks, in practice it is impossible to evaluate the trace for
arbitrary $\varphi(x)$: We have to find all the eigenvalues of the operator $\partial^2 + V''(\varphi)$, take their
log, and sum. Our task simplifies drastically if we are content with studying $\Gamma(\varphi)$ for
$\varphi$ independent of $x$, in which case $V''(\varphi)$ is a constant and the operator $\partial^2 + V''(\varphi)$ is
translation invariant and easily treated in momentum space:

$$\begin{aligned}
{\rm tr}\log[\partial^2 + V''(\varphi)] &= \int d^4x\,\langle x|\log[\partial^2 + V''(\varphi)]|x\rangle\\
&= \int d^4x\int\frac{d^4k}{(2\pi)^4}\,\langle x|k\rangle\langle k|\log[\partial^2 + V''(\varphi)]|k\rangle\langle k|x\rangle\\
&= \int d^4x\int\frac{d^4k}{(2\pi)^4}\,\log[-k^2 + V''(\varphi)]
\end{aligned} \qquad (13)$$


Referring to (5), we obtain

$$V_{\rm eff}(\varphi) = V(\varphi) - \frac{i\hbar}{2}\int\frac{d^4k}{(2\pi)^4}\log\left[\frac{k^2 - V''(\varphi)}{k^2}\right] + O(\hbar^2) \qquad (14)$$

known as the Coleman-Weinberg effective potential. What we computed is the order $\hbar$
correction to the classical potential $V(\varphi)$. Note that we have added a $\varphi$ independent constant
to make the argument of the logarithm dimensionless.

We can give a nice physical interpretation of (14). Let the universe be suffused with
the scalar field $\varphi(x)$ taking on the value $\varphi$, a background field so to speak. For $V(\varphi) =
\frac{1}{2}\mu^2\varphi^2 + (1/4!)\lambda\varphi^4$, we have $V''(\varphi) = \mu^2 + \frac{1}{2}\lambda\varphi^2 \equiv \mu(\varphi)^2$, which, as the notation $\mu(\varphi)^2$
suggests, we recognize as the $\varphi$-dependent effective mass squared of a scalar particle
propagating in the background field $\varphi$. The mass squared $\mu^2$ in the Lagrangian is corrected
by a term $\frac{1}{2}\lambda\varphi^2$ due to the interaction of the particle with the background field $\varphi$. Now we
see clearly what (14) tells us: The first term $V(\varphi)$ is the classical energy density contained
in the background $\varphi$, while the second term is the vacuum energy density of a scalar field
with mass squared equal to $V''(\varphi)$ [see (II.5.3) and exercise IV.3.4].


Your renormalization theory at work 


The integral in (14) is quadratically divergent, or more correctly, quadratically dependent
on the cutoff. But no sweat, we were instructed to introduce three counterterms (of which
only two are relevant here since $\varphi$ is independent of $x$). Thus, we actually have

$$V_{\rm eff}(\varphi) = V(\varphi) + \frac{\hbar}{2}\int\frac{d^4k_E}{(2\pi)^4}\log\left[\frac{k_E^2 + V''(\varphi)}{k_E^2}\right] + B\varphi^2 + C\varphi^4 + O(\hbar^2) \qquad (15)$$

where we have Wick rotated to a Euclidean integral (see appendix D). Using (D.9) and
integrating up to $k_E^2 = \Lambda^2$, we obtain (suppressing $\hbar$)

$$V_{\rm eff}(\varphi) = V(\varphi) + \frac{\Lambda^2}{32\pi^2}V''(\varphi) - \frac{[V''(\varphi)]^2}{64\pi^2}\log\frac{e^{1/2}\Lambda^2}{V''(\varphi)} + B\varphi^2 + C\varphi^4 \qquad (16)$$

As expected, since the integrand in (15) goes as $1/k_E^2$ for large $k_E$ the integral depends
quadratically and logarithmically on the cutoff $\Lambda^2$.
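The cutoff dependence can be checked numerically. The sketch below is a rough illustration with assumptions spelled out: it uses the four-dimensional Euclidean measure $d^4k_E = \pi^2\,u\,du$ with $u = k_E^2$, so that the one-loop term becomes $\frac{1}{32\pi^2}\int_0^{\Lambda^2} du\, u\log(1 + a/u)$ with $a \equiv V''(\varphi)$; it compares a midpoint-rule evaluation with the large-cutoff form $\frac{\Lambda^2}{32\pi^2}a - \frac{a^2}{64\pi^2}\log(e^{1/2}\Lambda^2/a)$. The values of $a$ and $\Lambda^2$ and the function names are arbitrary choices made here.

```python
import math

def one_loop_integral(a, cutoff_sq, steps=200000):
    """Midpoint-rule value of (1/(32 pi^2)) * int_0^{cutoff_sq} du u log(1 + a/u)."""
    du = cutoff_sq / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du          # midpoint of the i-th slice in u = k_E^2
        total += u * math.log(1.0 + a / u)
    return total * du / (32.0 * math.pi**2)

def closed_form(a, cutoff_sq):
    """Large-cutoff expression: quadratic plus logarithmic cutoff dependence."""
    quadratic = cutoff_sq * a / (32.0 * math.pi**2)
    logarithmic = a**2 / (64.0 * math.pi**2) * math.log(math.sqrt(math.e) * cutoff_sq / a)
    return quadratic - logarithmic

if __name__ == "__main__":
    a, cutoff_sq = 1.0, 1000.0      # illustrative values with cutoff_sq >> a
    print("numerical  :", one_loop_integral(a, cutoff_sq))
    print("closed form:", closed_form(a, cutoff_sq))
```

The two numbers agree up to corrections suppressed by $a/\Lambda^2$, which is all that the large-cutoff form claims.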
Watch renormalization theory at work! Since $V$ is a quartic polynomial in $\varphi$, $V''(\varphi)$ is a
quadratic polynomial and $[V''(\varphi)]^2$ a quartic polynomial. Thus, we have just enough
counterterms $B\varphi^2 + C\varphi^4$ to absorb the cutoff dependence. This is a particularly transparent
example of how the method of adding counterterms works.

To see how bad things can happen in a nonrenormalizable theory, suppose in contrast
that $V$ is a polynomial of degree 6 in $\varphi$. Then we are allowed to have three counterterms
$B\varphi^2 + C\varphi^4 + D\varphi^6$, but that is not enough since $[V''(\varphi)]^2$ is now a polynomial of degree
8. This means that we should have started out with $V$ a polynomial of degree 8, but then
$[V''(\varphi)]^2$ would be a polynomial of degree 12. Clearly, the process escalates into an
infinite-degree polynomial. We see the hallmark of a nonrenormalizable theory: its insatiable
appetite for counterterms.
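The escalation can be summarized in one line of degree counting. This is only a schematic restatement of the argument just given, with $n$ denoting the degree of the polynomial $V$:

```latex
\deg V = n \;\Longrightarrow\; \deg\,[V''(\varphi)]^2 = 2(n-2),
\qquad
2(n-2) \le n \;\Longleftrightarrow\; n \le 4 .
```

So the counterterms needed at this order are of the same form as the terms already in $V$ only for $n \le 4$; for $n = 6$ one is led to degree 8, then 12, then 20, and so on without end.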


Imposing renormalization conditions 


Waking up from the nightmare of an infinite number of counterterms chasing us, let us
go back to the sweetly renormalizable $\varphi^4$ theory. In chapter III.3 we fix the counterterms
by imposing conditions on various scattering amplitudes. Here we would have to fix the
coefficients $B$ and $C$ by imposing two conditions on $V_{\rm eff}(\varphi)$ at appropriate values of $\varphi$. We
are working in field space, so to speak, rather than momentum space, but the conceptual
framework is the same.

We could proceed with the general quartic polynomial $V(\varphi)$, but instead let us try to
answer the motivating question of this chapter: What happens when $\mu = 0$, that is, when
$V(\varphi) = (1/4!)\lambda\varphi^4$? The arithmetic is also simpler.

Evaluating (16) we get

$$V_{\rm eff}(\varphi) = \left(\frac{\Lambda^2}{64\pi^2}\lambda + B\right)\varphi^2 + \left(\frac{1}{4!}\lambda + \frac{\lambda^2}{(16\pi)^2}\log\frac{\varphi^2}{\Lambda^2} + C\right)\varphi^4 + O(\lambda^3)$$

(after absorbing some $\varphi$-independent constants into $C$). We see explicitly that the $\Lambda$
dependence can be absorbed into $B$ and $C$.

We started out with a purely quartic $V(\varphi)$. Quantum fluctuations generate a
quadratically divergent $\varphi^2$ term that we can cancel with the $B$ counterterm. What does $\mu = 0$ mean?
It means that $(d^2V/d\varphi^2)|_{\varphi=0}$ vanishes. To say that we have a $\mu = 0$ theory means that we
have to maintain a vanishing renormalized mass squared, defined here as the coefficient
of $\varphi^2$. Thus, we impose our first condition

$$\left.\frac{d^2V_{\rm eff}}{d\varphi^2}\right|_{\varphi=0} = 0 \qquad (17)$$

This is a somewhat long-winded way of saying that we want $B = -(\Lambda^2/64\pi^2)\lambda$ to this
order.

Similarly, we might think that the second condition would be to set $(d^4V_{\rm eff}/d\varphi^4)|_{\varphi=0}$
equal to some coupling, but differentiating the $\varphi^4\log\varphi$ term in $V_{\rm eff}$ four times we are
going to get a term like $\log\varphi$, which is not defined at $\varphi = 0$. We are forced to impose our
condition on $d^4V_{\rm eff}/d\varphi^4$ not at $\varphi = 0$ but at $\varphi$ equal to some arbitrarily chosen mass M.
(Recall that $\varphi$ has the dimension of mass.) Thus, the second condition reads

$$\left.\frac{d^4 V_{\rm eff}}{d\varphi^4}\right|_{\varphi=M} = \lambda(M) \tag{18}$$

where $\lambda(M)$ is a coupling manifestly dependent on M.
Plugging

$$V_{\rm eff}(\varphi) = \left(\frac{\lambda}{4!} + \frac{\lambda^2}{(16\pi)^2}\log\frac{\varphi^2}{\Lambda^2} + C\right)\varphi^4 + O(\lambda^3)$$

into (18) we see that $\lambda(M)$ is equal to $\lambda$ plus $O(\lambda^2)$ corrections, among which is a term like
$\lambda^2\log M$. We can get a clean relation by differentiating $\lambda(M)$:

$$M\frac{d\lambda(M)}{dM} = \frac{3}{16\pi^2}\lambda^2 + O(\lambda^3) = \frac{3}{16\pi^2}\lambda(M)^2 + O[\lambda(M)^3] \tag{19}$$

where the second equality is correct to the order indicated. This interesting relation tells us
how the coupling $\lambda(M)$ depends on the mass scale M at which it is defined. Recall exercise
III.1.3. We will come back to this relation in chapter VI.7 on the renormalization group.
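To get a feeling for what (19) says, one can integrate it numerically. The sketch below drops the $O(\lambda^3)$ terms, uses $t = \log(M/M_0)$ as the independent variable, and compares a Runge-Kutta integration with the closed-form solution of the truncated equation; the starting value 0.5 and the function names are illustrative choices, not taken from the text.

```python
import math

BETA_COEFF = 3.0 / (16.0 * math.pi ** 2)  # coefficient of lambda^2 in (19)

def beta(lam):
    """Right-hand side of (19) with the O(lambda^3) terms dropped."""
    return BETA_COEFF * lam ** 2

def run_coupling(lam0, log_ratio, steps=1000):
    """Integrate d(lambda)/dt = beta(lambda), t = log(M/M0), with classical RK4."""
    lam = lam0
    h = log_ratio / steps
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * h * k1)
        k3 = beta(lam + 0.5 * h * k2)
        k4 = beta(lam + h * k3)
        lam += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return lam

def closed_form(lam0, log_ratio):
    """Solution of the truncated equation: 1/lambda falls linearly in t."""
    return lam0 / (1.0 - BETA_COEFF * lam0 * log_ratio)

lam0 = 0.5
for t in (0.0, 1.0, 5.0, 10.0):
    print(t, run_coupling(lam0, t), closed_form(lam0, t))
```

The coupling grows slowly with the scale M at which it is defined, as the positive sign in (19) demands.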

Meanwhile, let us press on. Using (18) to determine C and plugging it into $V_{\rm eff}$ we
obtain

$$V_{\rm eff}(\varphi) = \frac{1}{4!}\lambda(M)\varphi^4 + \frac{\lambda(M)^2}{(16\pi)^2}\varphi^4\left(\log\frac{\varphi^2}{M^2} - \frac{25}{6}\right) + O[\lambda(M)^3] \tag{20}$$

You are no longer surprised, I suppose, that C and the cutoff $\Lambda$ have both disappeared.
That’s a renormalizable theory for you!

The fact that $V_{\rm eff}$ does not depend on the arbitrarily chosen M, namely, $M(dV_{\rm eff}/dM) = 0$,
reproduces (19) to the order indicated.
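The number 25/6 in (20) is fixed by condition (18): the first term already has fourth derivative $\lambda(M)$, so the fourth derivative of $\varphi^4(\log(\varphi^2/M^2) - c)$ must vanish at $\varphi = M$. The sketch below does the differentiation with exact fractions, storing a term $a\varphi^n\ln\varphi + b\varphi^n$ as a triple (n, a, b); that representation is just a convenience invented here.

```python
from fractions import Fraction
from math import factorial

def differentiate(term):
    """Differentiate a*phi^n*ln(phi) + b*phi^n, stored as (n, a, b), once in phi."""
    n, a, b = term
    # d/dphi [a phi^n ln phi] = a n phi^(n-1) ln phi + a phi^(n-1)
    # d/dphi [b phi^n]        = b n phi^(n-1)
    return (n - 1, a * n, a + b * n)

# phi^4 log(phi^2) = 2 phi^4 ln(phi)
term = (4, Fraction(2), Fraction(0))
for _ in range(4):
    term = differentiate(term)

power, log_coeff, const = term
# At phi = M the logarithm vanishes, leaving const. The fourth derivative of
# -c phi^4 is -4! c, so the two cancel when c = const / 4!.
c = const / factorial(4)
print(power, log_coeff, const, c)
```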


Breaking by quantum fluctuations 

Now we can answer the motivating question: To break or not to break?

Quantum fluctuations generate a correction to the potential of the form $+\varphi^4\log\varphi^2$,
but $\log\varphi^2$ is whopping big and negative for small $\varphi$! The $O(\hbar)$ correction overwhelms
the classical $O(\hbar^0)$ potential $+\varphi^4$ near $\varphi = 0$. Quantum fluctuations break the discrete
symmetry $\varphi \to -\varphi$.
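The claim can be checked directly on (20). The sketch below drops the $O[\lambda(M)^3]$ terms, solves $dV_{\rm eff}/d\varphi = 0$ by hand, which gives $\log(\varphi_{\min}^2/M^2) = 11/3 - 32\pi^2/(3\lambda)$, and evaluates the potential there. The value $\lambda = 1$ and the function names are illustrative assumptions.

```python
import math

def v_eff(phi, lam, M=1.0):
    """One-loop effective potential (20) with the O(lambda^3) terms dropped."""
    k = lam ** 2 / (16.0 * math.pi) ** 2
    return lam / 24.0 * phi ** 4 + k * phi ** 4 * (math.log(phi ** 2 / M ** 2) - 25.0 / 6.0)

def log_at_minimum(lam):
    """log(phi_min^2 / M^2) obtained from dV_eff/dphi = 0 in the truncated (20)."""
    return 11.0 / 3.0 - 32.0 * math.pi ** 2 / (3.0 * lam)

lam, M = 1.0, 1.0
L = log_at_minimum(lam)
phi_min = M * math.exp(0.5 * L)
loop_parameter = 3.0 * lam / (32.0 * math.pi ** 2) * L

print("phi_min / M =", phi_min)
print("V_eff(phi_min) =", v_eff(phi_min, lam, M))
print("(3 lambda / 32 pi^2) log(phi_min^2 / M^2) =", loop_parameter)
```

The potential at $\varphi_{\min}$ is negative, below its value at $\varphi = 0$, so the symmetric point is not the minimum. Notice, though, that the logarithm at the minimum is large enough that the one-loop term is as important as the classical one there.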

It is easy enough to determine the minima $\pm\varphi_{\min}$ of $V_{\rm eff}(\varphi)$ (which you should plot as
a function of $\varphi$ to get a feeling for). But closer inspection shows us that we cannot take the
precise value of $\varphi_{\min}$ seriously; $V_{\rm eff}$ has the form $\lambda\varphi^4(1 + \lambda\log\varphi + \cdots)$ suggesting that
the expansion parameter is actually $\lambda\log\varphi$ rather than $\lambda$. [Try to convince yourself that
$(\cdots)$ starts with $(\lambda\log\varphi)^2$.] The minima $\pm\varphi_{\min}$ of $V_{\rm eff}$ clearly occur when the expansion
parameter is of order unity. In an exercise in chapter IV.7 you will see a clever way of getting
around this problem.


Fermions


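In (11) $\varphi_s$ plays the role of an external field while $\tilde\varphi$ corresponds to a quantum field we
integrate over. The role of $\tilde\varphi$ can also be played by a fermion field $\psi$. Consider adding
$\bar\psi(i\partial\!\!\!/ - m - f\varphi)\psi$ to the Lagrangian. In the path integral

$$Z = \int D\varphi\, D\bar\psi\, D\psi\; e^{i\int d^4x\,[\frac{1}{2}(\partial\varphi)^2 - V(\varphi) + \bar\psi(i\partial\!\!\!/ - m - f\varphi)\psi]} \tag{21}$$

we can always choose to integrate over $\psi$ first, obtaining

$$Z = \int D\varphi\; e^{i\int d^4x\,[\frac{1}{2}(\partial\varphi)^2 - V(\varphi)] + {\rm tr}\log(i\partial\!\!\!/ - m - f\varphi)} \tag{22}$$

Repeating the steps in (13) we find that the fermion field contributes

$$V_F(\varphi) = +i\int\frac{d^4p}{(2\pi)^4}\,{\rm tr}\log\frac{p\!\!\!/ - m - f\varphi}{p\!\!\!/} \tag{23}$$

to $V_{\rm eff}(\varphi)$. (The trace in (23) is taken over the gamma matrices.) Again from chapter II.5,
we see that physically $V_F(\varphi)$ represents the vacuum energy of a fermion with the effective
mass $m(\varphi) = m + f\varphi$.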

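We can massage the trace of the logarithm (using tr log M = log det M (II.5.12) and
cyclically permuting factors in a determinant):

$$\begin{aligned}
{\rm tr}\log(p\!\!\!/ - a) &= {\rm tr}\log\gamma^5(p\!\!\!/ - a)\gamma^5 = {\rm tr}\log(-p\!\!\!/ - a)\\
&= \tfrac{1}{2}{\rm tr}\,(\log(p\!\!\!/ - a) + \log(p\!\!\!/ + a)) + \tfrac{1}{2}{\rm tr}\log(-1)\\
&= \tfrac{1}{2}{\rm tr}\log(-1)(p^2 - a^2)
\end{aligned} \tag{24}$$

Hence,

$${\rm tr}\log\frac{p\!\!\!/ - a}{p\!\!\!/} = \frac{1}{2}{\rm tr}\log\frac{p^2 - a^2}{p^2} = 2\log\frac{p^2 - a^2}{p^2} \tag{25}$$

and so

$$V_F(\varphi) = 2i\int\frac{d^4p}{(2\pi)^4}\log\frac{p^2 - m(\varphi)^2}{p^2} \tag{26}$$

Contrast the overall sign with the sign in (14): the difference in sign between fermionic
and bosonic loops was explained in chapter II.5.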
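The factor of 2 in (25) can be cross-checked through determinants: since tr log M = log det M, the statement ${\rm tr}\log(p\!\!\!/ - a) = 2\log(p^2 - a^2)$ up to constants amounts to $\det(p\!\!\!/ - a) = (p^2 - a^2)^2$. The sketch below verifies this numerically. The Dirac representation of the gamma matrices and the sample momentum are choices made here for illustration; any representation should give the same determinant.

```python
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[i] + b[i] for i in range(2)] + [c[i] + d[i] for i in range(2)]

# Dirac representation: gamma^0 = diag(1, -1), gamma^i = [[0, sigma_i], [-sigma_i, 0]]
GAMMA = [
    block(I2, Z2, Z2, neg(I2)),
    block(Z2, SX, neg(SX), Z2),
    block(Z2, SY, neg(SY), Z2),
    block(Z2, SZ, neg(SZ), Z2),
]

def slash(p):
    """p-slash = gamma^mu p_mu with metric (+,-,-,-); p holds the components p^mu."""
    signs = (1, -1, -1, -1)
    return [[sum(signs[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total

def det_slash_minus_a(p, a):
    m = slash(p)
    for i in range(4):
        m[i][i] -= a
    return det(m)

p, a = (3.0, 1.0, 2.0, 0.5), 0.7
p_squared = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
print(det_slash_minus_a(p, a), (p_squared - a ** 2) ** 2)
```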

Thus, in the end the effective potential generated by the quantum fluctuations has a
pleasing interpretation: It is just the energy density due to the fluctuating energy, entirely
analogous to the zero point energy of the harmonic oscillator, of quantum fields living in
the background $\varphi$ (see exercise IV.3.5).


Exercises 


IV.3.1 Consider the effective potential in (0 + 1)-dimensional spacetime:

$$V_{\rm eff}(\varphi) = V(\varphi) + \frac{\hbar}{2}\int\frac{dk_E}{(2\pi)}\log\frac{k_E^2 + V''(\varphi)}{k_E^2} + O(\hbar^2)$$

No counterterm is needed since the integral is perfectly convergent. But (0 + 1)-dimensional field theory
is just quantum mechanics. Evaluate the integral and show that $V_{\rm eff}$ is in complete accord with your
knowledge of quantum mechanics.

IV.3.2 Study $V_{\rm eff}$ in (1 + 1)-dimensional spacetime.


IV.3.3 Consider a massless fermion field $\psi$ coupled to a scalar field $\varphi$ by $f\varphi\bar\psi\psi$ in (1 + 1)-dimensional
spacetime. Show that




after a suitable counterterm has been added. This result is important in condensed matter physics, as 
we will see in chapter V.5 on the Peierls instability. 


IV.3.4 Understand (14) using Feynman diagrams. Show that $V_{\rm eff}$ is generated by an infinite number of
diagrams. [Hint: Expand the logarithm in (14) as a series in $V''(\varphi)/k^2$ and try to associate a Feynman
diagram with each term in the series.]


IV.3.5 Consider the electrodynamics of a complex scalar field

$$\mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + [(\partial^\mu + ieA^\mu)\varphi^\dagger][(\partial_\mu - ieA_\mu)\varphi] + \mu^2\varphi^\dagger\varphi - \lambda(\varphi^\dagger\varphi)^2 \tag{28}$$

In a universe suffused with the scalar field $\varphi(x)$ taking on the value $\varphi$ independent of x as in the text,
the Lagrangian will contain a term $(e^2\varphi^\dagger\varphi)A_\mu A^\mu$ so that the effective mass squared of the photon field
becomes $M(\varphi)^2 = e^2\varphi^\dagger\varphi$. Show that its contribution to $V_{\rm eff}(\varphi)$ has the form



Compare with (14) and (26). [Hint: Use the Landau gauge to simplify the calculation.] If you need help, 
I strongly urge you to read S. Coleman and E. Weinberg, Phys. Rev. D7: 1883, 1973, a paragon of clarity 
in exposition. 




Magnetic Monopole 


Quantum mechanics and magnetic monopoles 


Curiously enough, while electric charges are commonplace nobody has ever seen a magnetic
charge or monopole. Within classical physics we can perfectly well modify one of
Maxwell’s equations to $\nabla\cdot\vec B = \rho_M$, with $\rho_M$ denoting the density of magnetic monopoles.
The only price we have to pay is that the magnetic field $\vec B$ can no longer be represented
as $\vec B = \nabla\times\vec A$ since otherwise $\nabla\cdot\vec B = \nabla\cdot\nabla\times\vec A = \varepsilon_{ijk}\partial_i\partial_j A_k = 0$ identically. Newton and
Leibniz told us that derivatives commute with each other.

So what, you say. Indeed, who cares that $\vec B$ cannot be written as $\nabla\times\vec A$? The vector
potential $\vec A$ was introduced into physics only as a mathematical crutch, and indeed that is
still how students are often taught in a course on classical electromagnetism. As the
distinguished nineteenth-century physicist Heaviside thundered, “Physics should be purged
of such rubbish as the scalar and vector potentials; only the fields $\vec E$ and $\vec B$ are physical.”

With the advent of quantum mechanics, however, Heaviside was proved to be quite
wrong. Recall, for example, the nonrelativistic Schrödinger equation for a charged particle
in an electromagnetic field:

$$\left[-\frac{1}{2m}(\vec\nabla - ie\vec A)^2 + e\phi\right]\psi = E\psi \tag{1}$$

Charged particles couple directly to the vector and scalar potentials $\vec A$ and $\phi$, which are
thus seen as being more fundamental, in some sense, than the electromagnetic fields $\vec E$
and $\vec B$, as I alluded to in chapter III.4. Quantum physics demands the vector potential.

Dirac noted brilliantly that these remarks imply an intrinsic conflict between quantum
mechanics and the concept of magnetic monopoles. Upon closer analysis, he found that
quantum mechanics does not actually forbid the existence of magnetic monopoles. It
allows magnetic monopoles, but only those carrying a specific amount of magnetic charge.


Differential forms


For the following discussion and for the next chapter on Yang-Mills theory, it is highly
convenient to use the language of differential forms. Fear not, we will need only a few
elementary concepts. Let $x^\mu$ be D real variables (thus, the index $\mu$ takes on D values)
and $A_\mu$ (not necessarily the electromagnetic gauge potential in this purely mathematical
section) be D functions of the x’s. In our applications, $x^\mu$ represent coordinates and, as
we will see, differential forms have natural geometric interpretations.

We call the object $A = A_\mu dx^\mu$ a 1-form. The differentials $dx^\mu$ are treated following Newton
and Leibniz. If we change coordinates $x \to x'$, then as usual $dx^\mu = (\partial x^\mu/\partial x'^\nu)dx'^\nu$ so
that $A = A_\mu dx^\mu = A_\mu(\partial x^\mu/\partial x'^\nu)dx'^\nu \equiv A'_\nu dx'^\nu$. This reproduces the standard transformation
law of vectors under coordinate transformation $A'_\nu = A_\mu(\partial x^\mu/\partial x'^\nu)$. As an example,
consider $A = \cos\theta\,d\varphi$. Regarding $\theta$ and $\varphi$ as angular coordinates on a 2-sphere (namely
the surface of a 3-ball), we have $A_\theta = 0$ and $A_\varphi = \cos\theta$. Similarly, we define a p-form as
$H = (1/p!)H_{\mu_1\mu_2\cdots\mu_p}dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$. (Repeated indices are summed, as always.) The
“degenerate” example is that of a 0-form, call it $\Lambda$, which is just a scalar function of the
coordinates $x^\mu$. An example of a 2-form is $F = (1/2!)F_{\mu\nu}dx^\mu dx^\nu$.

We now face the question of how to think about products of differentials. In an elementary
course on calculus we learned that dx dy represents the area of an infinitesimal
rectangle with length dx and width dy. At that level, we more or less automatically regard
dy dx as the same as dx dy. The order of writing the differentials does not matter. However,
think about making a coordinate transformation so that $x = x(x', y')$ and $y = y(x', y')$ are
now functions of the new coordinates $x'$ and $y'$. Now look at

$$dx\,dy = \left(\frac{\partial x}{\partial x'}dx' + \frac{\partial x}{\partial y'}dy'\right)\left(\frac{\partial y}{\partial x'}dx' + \frac{\partial y}{\partial y'}dy'\right) \tag{2}$$

Note that the coefficient of $dx'dy'$ is $(\partial x/\partial x')(\partial y/\partial y')$ and that the coefficient of $dy'dx'$
is $(\partial x/\partial y')(\partial y/\partial x')$. We see that it is much better if we regard the differentials $dx^\mu$
as anticommuting objects [what mathematicians would call Grassmann variables (recall
chapter II.5)] so that $dy'dx' = -dx'dy'$ and $dx'dx' = 0 = dy'dy'$. Then (2) simplifies
neatly to

$$dx\,dy = \left(\frac{\partial x}{\partial x'}\frac{\partial y}{\partial y'} - \frac{\partial x}{\partial y'}\frac{\partial y}{\partial x'}\right)dx'dy' \equiv J(x, y; x', y')\,dx'dy' \tag{3}$$

We obtain the correct Jacobian $J(x, y; x', y')$ for transforming the area element dx dy to
the area element $dx'dy'$.
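A familiar case makes (3) concrete. The sketch below multiplies two 1-forms using only the anticommutation rule and applies it to polar coordinates, where the Jacobian should come out as r. The tuple representation of a 1-form and the function names are conveniences invented here.

```python
import math

def wedge(one_form_a, one_form_b):
    """Coefficient of dx'dy' in the product of two 1-forms.

    Each 1-form is given as (coefficient of dx', coefficient of dy'). With
    dx'dx' = dy'dy' = 0 and dy'dx' = -dx'dy', only the cross terms survive.
    """
    a1, a2 = one_form_a
    b1, b2 = one_form_b
    return a1 * b2 - a2 * b1

def polar_area_coefficient(r, theta):
    """dx dy for x = r cos(theta), y = r sin(theta), taking (x', y') = (r, theta)."""
    dx = (math.cos(theta), -r * math.sin(theta))  # dx = cos(theta) dr - r sin(theta) dtheta
    dy = (math.sin(theta), r * math.cos(theta))   # dy = sin(theta) dr + r cos(theta) dtheta
    return wedge(dx, dy)

print(polar_area_coefficient(2.0, 0.7))  # equals r up to rounding
```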

In many texts, dx dy is written as $dx\wedge dy$. We will omit the wedge; no reason to clutter
up the page.

This little exercise tells us that we should define $dx^\mu dx^\nu = -dx^\nu dx^\mu$ and regard the
area element $dx^\mu dx^\nu$ as directional. The area elements $dx^\mu dx^\nu$ and $dx^\nu dx^\mu$ have the same
magnitude but point in opposite directions.

We now define a differential operation d to act on any form. Acting on a p-form H, it
gives by definition

$$dH = \frac{1}{p!}\partial_\nu H_{\mu_1\mu_2\cdots\mu_p}dx^\nu dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$$

Thus, $d\Lambda = \partial_\nu\Lambda\,dx^\nu$ and

$$dA = \partial_\nu A_\mu dx^\nu dx^\mu = \tfrac{1}{2}(\partial_\nu A_\mu - \partial_\mu A_\nu)dx^\nu dx^\mu$$

In the last step, we used $dx^\mu dx^\nu = -dx^\nu dx^\mu$.

We see that this mathematical formalism is almost tailor made to describe electromagnetism.
If we call $A = A_\mu dx^\mu$ the potential 1-form and think of $A_\mu$ as the electromagnetic
potential, then F = dA is in fact the field 2-form. If we write F out in terms of its
components $F = (1/2!)F_{\mu\nu}dx^\mu dx^\nu$, then $F_{\mu\nu}$ is indeed equal to the electromagnetic field
$(\partial_\mu A_\nu - \partial_\nu A_\mu)$.

Note that $x^\mu$ is not a form, and $dx^\mu$ is not d acting on a form.

If you like, you can think of differential forms as “merely” an elegantly compact notation.
The point is to think of physical objects such as A and F as entities, without having to
commit to any particular coordinate system. This is particularly convenient when one has
to deal with objects more complicated than A and F, for example in string theory. By using
differential forms, we avoid drowning in a sea of indices.

An important identity is

$$dd = 0 \tag{4}$$

which says that acting with d on any form twice gives zero. Verify this as an exercise. In
particular dF = ddA = 0. If you write this out in components you will recognize it as a
standard identity (the “Bianchi identity”) in electromagnetism.


Closed is not necessarily globally exact 

It is convenient here to introduce some jargon. A p-form $\alpha$ is said to be closed if $d\alpha = 0$.
It is said to be exact if there exists a (p − 1)-form $\beta$ such that $\alpha = d\beta$.

Talking the talk, we say that (4) tells us that exact forms are closed.

Is the converse of (4) true? Kind of. The Poincaré lemma states that a closed form is
locally exact. In other words, if dH = 0 with H some p-form, then locally

$$H = dK \tag{5}$$

for some (p − 1)-form K. However, it may or may not be the case that H = dK globally,
that is, everywhere. Actually, whether you know it or not, you are already familiar with the
Poincaré lemma. For example, surely you learned somewhere that if the curl of a vector
field vanishes, the vector field is locally the gradient of some scalar field.

Forms are ready made to be integrated over. For example, given the 2-form
$F = (1/2!)F_{\mu\nu}dx^\mu dx^\nu$, we can write $\int_M F$ for any 2-manifold M. Note that the measure is
already included and there is no need to specify a coordinate choice. Again, whether you
know it or not, you are already familiar with the important theorem

$$\int_M dH = \int_{\partial M} H \tag{6}$$

with H a p-form and $\partial M$ the boundary of a (p + 1)-dimensional manifold M.


Dirac quantization of magnetic charge 


After this dose of mathematics, we are ready to do some physics. Consider a sphere
surrounding a magnetic monopole with magnetic charge g. Then the electromagnetic
field 2-form is given by $F = (g/4\pi)\,d\cos\theta\,d\varphi$. This is almost a definition of what we mean
by a magnetic monopole (see exercise IV.4.3). In particular, calculate the magnetic flux by
integrating F over the sphere $S^2$

$$\int_{S^2} F = g \tag{7}$$

As I have already noted, the area element is automatically included. Indeed, you might
have recognized $d\cos\theta\,d\varphi = -\sin\theta\,d\theta\,d\varphi$ as precisely the area element on a unit sphere.
Note that in “ordinary notation” (7) implies the magnetic field $\vec B = (g/4\pi r^2)\hat r$, with $\hat r$ the
unit vector in the radial direction.

I will now give a rather mathematical, but rigorous, derivation, originally developed by 
Wu and Yang, of Dirac’s quantization of the magnetic charge g. 

First, let us recall how gauge invariance works, from, for example, (II.7.3). Under a
transformation of the electron field $\psi(x) \to e^{i\Lambda(x)}\psi(x)$, the electromagnetic gauge potential
changes by

$$A_\mu(x) \to A_\mu(x) + \frac{1}{ie}e^{-i\Lambda(x)}\partial_\mu e^{i\Lambda(x)}$$

or in the language of forms,

$$A \to A + \frac{1}{ie}e^{-i\Lambda}de^{i\Lambda} \tag{8}$$

Differentiating, we can of course write

$$A_\mu(x) \to A_\mu(x) + \frac{1}{e}\partial_\mu\Lambda(x)$$

as is commonly done. The form given in (8) reminds us that gauge transformation is
defined as multiplication by a phase factor $e^{i\Lambda(x)}$, so that $\Lambda(x)$ and $\Lambda(x) + 2\pi$ describe
exactly the same transformation.

In quantum mechanics A is physical, pace Heaviside, and so we should ask what A
would give rise to $F = (g/4\pi)\,d\cos\theta\,d\varphi$. Easy, you say; clearly $A = (g/4\pi)\cos\theta\,d\varphi$. (In
checking this by calculating dA, remember that dd = 0.)

But not so fast; your mathematician friend says that $d\varphi$ is not defined at the north and
south poles. Put his objection into everyday language: If you are standing on the north pole,
what is your longitude? So strictly speaking it is forbidden to write $A = (g/4\pi)\cos\theta\,d\varphi$.

But, you are smart enough to counter, then what about $A_N = (g/4\pi)(\cos\theta - 1)\,d\varphi$, eh?
When you act with d on $A_N$ you obtain the desired F; the added piece $(g/4\pi)(-1)\,d\varphi$ gets
annihilated by d thanks once again to the identity (4). At the north pole, $\cos\theta = 1$, $A_N$
vanishes, and is thus perfectly well defined.

OK, but your mathematician friend points out that your $A_N$ is not defined at the south
pole, where it is equal to $(g/4\pi)(-2)\,d\varphi$.

Right, you respond, I anticipated that by adding the subscript N. I am now also forced
to define $A_S = (g/4\pi)(\cos\theta + 1)\,d\varphi$. Note that d acting on $A_S$ again gives the desired F.
But now $A_S$ is defined everywhere except at the north pole.

In mathematical jargon, we say that the gauge potential A is defined locally, but not
globally. The gauge potential $A_N$ is defined on a “coordinate patch” covering the northern
hemisphere and extending past the equator as far south as we want as long as we do
not include the south pole. Similarly, $A_S$ is defined on a “coordinate patch” covering the
southern hemisphere and extending past the equator as far north as we want as long as
we do not include the north pole.

But what happens where the two coordinate patches overlap, for example, along the
equator? The gauge potentials $A_N$ and $A_S$ are not the same:

$$A_S - A_N = 2\frac{g}{4\pi}\,d\varphi \tag{9}$$

Now what? Aha, but this is a gauge theory: If $A_S$ and $A_N$ are related by a gauge transformation,
then all is well. Thus, referring to (8) we require that $2(g/4\pi)\,d\varphi = (1/ie)e^{-i\Lambda}de^{i\Lambda}$
for some phase function $e^{i\Lambda}$. By inspection we have $e^{i\Lambda} = e^{i2(eg/4\pi)\varphi}$.

But $\varphi = 0$ and $\varphi = 2\pi$ describe exactly the same point. In order for $e^{i\Lambda}$ to make sense,
we must have $e^{i2(eg/4\pi)(2\pi)} = e^{i2(eg/4\pi)(0)} = 1$; in other words, $e^{ieg} = 1$, or

$$g = \frac{2\pi}{e}n \tag{10}$$

where n denotes an integer. This is Dirac’s famous discovery that the magnetic charge on
a magnetic monopole is quantized in units of $2\pi/e$. A “dual” way of putting this is that if
the monopole exists then electric charge is quantized in units of $2\pi/g$.
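The single-valuedness requirement is easy to test numerically. The sketch below evaluates the transition function $e^{i2(eg/4\pi)\varphi}$ at $\varphi = 0$ and $\varphi = 2\pi$ for a few values of g; the value e = 0.3 is an arbitrary illustrative number and the function names are made up here.

```python
import cmath
import math

def transition_function(e, g, phi):
    """exp(i Lambda) relating A_S to A_N on the overlap: exp(i * 2 * (e g / 4 pi) * phi)."""
    return cmath.exp(1j * 2.0 * (e * g / (4.0 * math.pi)) * phi)

def is_single_valued(e, g, tol=1e-9):
    """True if the transition function returns to its phi = 0 value at phi = 2 pi."""
    start = transition_function(e, g, 0.0)
    end = transition_function(e, g, 2.0 * math.pi)
    return abs(end - start) < tol

e = 0.3  # illustrative value of the electric charge, not a physical input
for n in (0, 1, 2, -3):
    print(n, is_single_valued(e, 2.0 * math.pi * n / e))
print(0.5, is_single_valued(e, 2.0 * math.pi * 0.5 / e))
```

Only the values of g allowed by (10) give a transition function that closes on itself around the equator.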

Note that the whole point is that F is locally but not globally exact; otherwise by (6) the
magnetic charge $g = \int_{S^2} F$ would be zero.

I show you this rigorous mathematical derivation partly to cut through a lot of the 
confusion typical of the derivations in elementary texts and partly because this type of 
argument is used repeatedly in more advanced areas of physics, such as string theory. 


Electromagnetic duality 

That a duality may exist between electric and magnetic fields has tantalized theoretical
physicists for a century and a half. By the way, if you read Maxwell, you will discover that he
often talked about magnetic charges. You can check that Maxwell’s equations are invariant
under the elegant transformation $(\vec E + i\vec B) \to e^{i\theta}(\vec E + i\vec B)$ if magnetic charges exist.

One intriguing feature of (10) is that if e is small, then g is large, and vice versa. What
would magnetic charges look like if they exist? They wouldn’t look any different from
electric charges: They too interact with a 1/r potential, with likes repelling and opposites
attracting. In principle, we could have perfectly formulated electromagnetism in terms of
magnetic charges, with magnetic and electric fields exchanging their roles, but the theory
would be strongly coupled, with the coupling g rather than e.

Theoretical physicists are interested in duality because it allows them a glimpse into 
field theories in the strongly coupled regime. Under duality, a weakly coupled field theory 
is mapped into a strongly coupled field theory. This is exactly the reason why the discovery 
some years ago that certain string theories are dual to others caused such enormous 
excitement in the string theory community: We get to know how string theories behave in 
the strongly coupled regime. More on duality in chapter VI.3. 


Forms and geometry 

The geometric character of differential forms is further clarified by thinking about the
electromagnetic current of a charged particle tracing out the world line $X^\mu(\tau)$ in
D-dimensional spacetime (see figure IV.4.1a):

$$J^\mu(x) = \int d\tau\,\frac{dX^\mu}{d\tau}\,\delta^{(D)}[x - X(\tau)] \tag{11}$$

The interpretation of this elementary formula from electromagnetism is clear: $dX^\mu/d\tau$ is
the 4-velocity at a given value of the parameter $\tau$ (“proper time”) and the delta function
ensures that the current at x vanishes unless the particle passes through x. Note that $J^\mu(x)$
is invariant under the reparametrization $\tau \to \tau'(\tau)$.

The generalization to an extended object is more or less obvious. Consider a string. It
traces out a world sheet $X^\mu(\tau, \sigma)$ in spacetime (see figure IV.4.1b), where $\sigma$ is a parameter
telling us where we are along the length of the string. [For example, for a closed string,
$\sigma$ is conventionally taken to range between 0 and $2\pi$ with $X^\mu(\tau, 0) = X^\mu(\tau, 2\pi)$.] The
current associated with the string is evidently given by

$$J^{\mu\nu}(x) = \int d\tau\,d\sigma\,\det\begin{pmatrix}\partial_\tau X^\mu & \partial_\tau X^\nu\\ \partial_\sigma X^\mu & \partial_\sigma X^\nu\end{pmatrix}\delta^{(D)}[x - X(\tau, \sigma)] \tag{12}$$

where $\partial_\tau \equiv \partial/\partial\tau$ and so forth. The determinant is forced on us by the requirement of
invariance under reparametrization $\tau \to \tau'(\tau, \sigma)$, $\sigma \to \sigma'(\tau, \sigma)$. It follows that $J^{\mu\nu}$ is an
antisymmetric tensor. Hence, the analog of the electromagnetic potential $A_\mu$ coupling to
the current $J^\mu$ is an antisymmetric tensor field $B_{\mu\nu}$ coupling to the current $J^{\mu\nu}$. Thus,
string theory contains a 2-form potential $B = \frac{1}{2}B_{\mu\nu}dx^\mu dx^\nu$ and the corresponding 3-form
field H = dB. In fact, string theory typically contains numerous p-forms.


Aharonov-Bohm effect

The reality of the gauge potential A was brought home forcefully in 1959 by Aharonov and 
Bohm. Consider a magnetic field $\vec B$ confined to a region $\Omega$ as illustrated in figure IV.4.2.
The quantum physics of an electron is described by solving the Schrödinger equation (1).
In Feynman’s path integral formalism the amplitude associated with a path P is modified
by a multiplicative factor $e^{ie\int_P \vec A\cdot d\vec x}$, where the line integral is evaluated along the path P.
Thus, in the path integral calculation of the probability for an electron to propagate from
a to b (fig. IV.4.2), there will be interference between the contributions from path 1 and
path 2 of the form

$$e^{ie\int_{P_1}\vec A\cdot d\vec x}\left(e^{ie\int_{P_2}\vec A\cdot d\vec x}\right)^* = e^{ie\oint\vec A\cdot d\vec x}$$

but $\oint\vec A\cdot d\vec x = \int\vec B\cdot d\vec S$ is precisely the flux enclosed by the closed curve $(P_1 - P_2)$, namely
the curve going from a to b along $P_1$ and then returning from b to a along $(-P_2)$ since
complex conjugation in effect reverses the direction of the path $P_2$. Remarkably, the
electron feels the effect of the magnetic field even though it never wanders into a region
with a magnetic field present.

[Figure IV.4.2: paths $P_1$ and $P_2$ from a to b passing on either side of the region $\Omega$; B = 0 along the paths.]

When the Aharonov-Bohm paper was first published, no less an authority than Niels 
Bohr was deeply disturbed. The effect has since been conclusively demonstrated in a series 
of beautiful experiments by Tonomura and collaborators. 

Coleman once told of a gedanken prank that connects the Aharonov-Bohm effect to
Dirac quantization of magnetic charge. Let us fabricate an extremely thin solenoid so that
it is essentially invisible and thread it into the lab of an unsuspecting experimentalist,
perhaps our friend from chapter III.1. We turn on a current and generate a magnetic field
through the solenoid. When the experimentalist suddenly sees the magnetic flux coming
out of apparently nowhere, she gets so excited that she starts planning to go to Stockholm.

What is the condition that prevents the experimentalist from discovering the prank? A
careful experimentalist might start scattering electrons around to see if she can detect a
solenoid. The condition that she does not see an Aharonov-Bohm effect and thus does
not discover the prank is precisely that the flux going through the solenoid is an integer
times $2\pi/e$. This implies that the apparent magnetic monopole has precisely the magnetic
charge predicted by Dirac!
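What the careful experimentalist would measure can be mimicked with a two-path toy model. In the sketch below both paths have unit amplitude, `delta` stands for whatever phase difference the geometry produces without the solenoid, and the solenoid adds the Aharonov-Bohm phase e times the flux. The model, the value e = 0.3, and the names are illustrative assumptions, not a description of a real experiment.

```python
import cmath
import math

def intensity(e, flux, delta):
    """|psi_1 + psi_2|^2 for two unit-amplitude paths around the solenoid.

    delta is the phase difference the paths would have with no solenoid
    (a hypothetical stand-in for the geometry); e * flux is the extra phase.
    """
    return abs(1.0 + cmath.exp(1j * (delta + e * flux))) ** 2

e = 0.3  # illustrative charge
unit = 2.0 * math.pi / e  # the flux unit 2 pi / e
for delta in (0.0, 1.0, 2.5):
    print(delta,
          intensity(e, 0.0, delta),          # no solenoid
          intensity(e, 3 * unit, delta),     # integer number of flux units
          intensity(e, 0.25 * unit, delta))  # a quarter of a flux unit
```

With an integer number of flux units the interference pattern is indistinguishable from the pattern with no solenoid at all, and the prank goes undetected.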


Exercises 

IV.4.1 Prove dd = 0.

IV.4.2 Show by writing out the components explicitly that dF = 0 expresses something that you are familiar
with but disguised in a compact notation.

IV.4.3 Consider $F = (g/4\pi)\,d\cos\theta\,d\varphi$. By transforming to Cartesian coordinates show that this describes a
magnetic field pointing outward along the radial direction.

IV.4.4 Restore the factors of $\hbar$ and c in Dirac’s quantization condition.

IV.4.5 Write down the reparametrization-invariant current $J^{\mu\nu\lambda}$ of a membrane.

IV.4.6 Let g(x) be the element of a group G. The 1-form $v = g\,dg^\dagger$ is known as the Cartan-Maurer form.
Then ${\rm tr}\,v^N$ is trivially closed on an N-dimensional manifold since it is already an N-form. Consider
$Q = \int_{S^N}{\rm tr}\,v^N$ with $S^N$ the N-dimensional sphere. Discuss the topological meaning of Q. These
considerations will become important later when we discuss topology in field theory in chapter V.7. [Hint:
Study the case N = 3 and G = SU(2).]




Nonabelian Gauge Theory 


Most such ideas are eventually discarded or shelved. But some 
persist and may become obsessions. Occasionally an obsession 
does finally turn out to be something good. 

—C. N. Yang talking about an idea that he first had as a 
student and that he kept coming back to year after year . 1 


Local transformation 

It was quite a nice little idea. 

To explain the idea Yang was talking about, recall our discussion of symmetry in
chapter I.10. For the sake of definiteness let $\varphi(x) = \{\varphi_1(x), \varphi_2(x), \cdots, \varphi_N(x)\}$ be an
N-component complex scalar field transforming as $\varphi(x) \to U\varphi(x)$, with U an element of
SU(N). Since $\varphi^\dagger \to \varphi^\dagger U^\dagger$ and $U^\dagger U = 1$, we have $\varphi^\dagger\varphi \to \varphi^\dagger\varphi$ and $\partial\varphi^\dagger\partial\varphi \to \partial\varphi^\dagger\partial\varphi$. The
invariance of the Lagrangian $\mathcal{L} = \partial\varphi^\dagger\partial\varphi - V(\varphi^\dagger\varphi)$ under SU(N) is obvious for any
polynomial V.

In the theoretical physics community there are many more people who can answer well- 
posed questions than there are people who can pose the truly important questions. The 
latter type of physicist can invariably also do much of what the former type can do, but the 
reverse is certainly not true. 

In 1954 C.N. Yang and R. Mills asked what will happen if the transformation varies from 
place to place in spacetime, or in other words, if U = U(x) is a function of x. 

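Clearly, $\varphi^\dagger\varphi$ is still invariant. But in contrast $\partial\varphi^\dagger\partial\varphi$ is no longer invariant. Indeed,

$$\partial_\mu\varphi \to \partial_\mu(U\varphi) = U\partial_\mu\varphi + (\partial_\mu U)\varphi = U[\partial_\mu\varphi + (U^\dagger\partial_\mu U)\varphi]$$

To cancel the unwanted term $(U^\dagger\partial_\mu U)\varphi$, we generalize the ordinary derivative $\partial_\mu$ to a
covariant derivative $D_\mu$, which when acting on $\varphi$, gives

$$D_\mu\varphi(x) = \partial_\mu\varphi(x) - iA_\mu(x)\varphi(x) \tag{1}$$

The field $A_\mu$ is called a gauge potential in direct analogy with electromagnetism.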

1 C. N. Yang, Selected Papers 1945-1980 with Commentary, p. 19.

How must $A_\mu$ transform, so that $D_\mu\varphi(x) \to U(x)D_\mu\varphi(x)$? In other words, we would
like $D_\mu\varphi(x)$ to transform the way $\partial_\mu\varphi(x)$ transformed when U did not depend on x. If
so, then $[D_\mu\varphi(x)]^\dagger D^\mu\varphi(x) \to [D_\mu\varphi(x)]^\dagger D^\mu\varphi(x)$ and can be used as an invariant kinetic
energy term for the field $\varphi$.

Working backward, we see that $D_\mu\varphi(x) \to U(x)D_\mu\varphi(x)$ if (and it goes without saying
that you should be checking this)

$$A_\mu \to UA_\mu U^\dagger - i(\partial_\mu U)U^\dagger = UA_\mu U^\dagger + iU\partial_\mu U^\dagger \tag{2}$$

(The equality follows from $UU^\dagger = 1$.) We refer to $A_\mu$ as the nonabelian gauge potential
and to (2) as a nonabelian gauge transformation.

Let us now make a series of simple observations. 

1. Clearly, $A_\mu$ have to be N by N matrices. Work out the transformation law for $A_\mu^\dagger$ using
(2) and show that the condition $A_\mu - A_\mu^\dagger = 0$ is preserved by the gauge transformation.
Thus, it is consistent to take $A_\mu$ to be hermitean. Specifically, you should work out what
this means for the group SU(2) so that $U = e^{i\theta\cdot\tau/2}$ where $\theta\cdot\tau = \theta^a\tau^a$, with $\tau^a$ the familiar
Pauli matrices.

2. Writing $U = e^{i\theta\cdot T}$ with $T^a$ the generators of SU(N), we have

$$A_\mu \to A_\mu + i\theta^a[T^a, A_\mu] + \partial_\mu\theta^a T^a \tag{3}$$

under an infinitesimal transformation $U \simeq 1 + i\theta\cdot T$. For most purposes, the infinitesimal
form (3) suffices.

3. Taking the trace of (3) we see that the trace of $A_\mu$ does not transform and so we can take $A_\mu$
to be traceless as well as hermitean. This means that we can always write $A_\mu = A_\mu^a T^a$ and
thus decompose the matrix field $A_\mu$ into component fields $A_\mu^a$. There are as many $A_\mu^a$’s as
there are generators in the group [3 for SU(2), 8 for SU(3), and so forth].

4. You are reminded in appendix B that the Lie algebra of the group is defined by $[T^a, T^b] =
if^{abc}T^c$, where the numbers $f^{abc}$ are called structure constants. For example, $f^{abc} = \varepsilon^{abc}$
for SU(2). Thus, (3) can be written as

$$A_\mu^a \to A_\mu^a - f^{abc}\theta^b A_\mu^c + \partial_\mu\theta^a \tag{4}$$

Note that if $\theta$ does not depend on x, the $A_\mu^a$’s transform as the adjoint representation of the
group.

5. If $U(x) = e^{i\theta(x)}$ is just an element of the abelian group U(1), all these expressions simplify
and $A_\mu$ is just the abelian gauge potential familiar from electromagnetism, with (2) the usual
abelian gauge transformation. Hence, $A_\mu$ is known as the nonabelian gauge potential.
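The passage from (3) to (4) in observation 4 can be checked numerically for SU(2), where $T^a = \tau^a/2$ and $f^{abc} = \varepsilon^{abc}$, so that the homogeneous term $-f^{abc}\theta^b A_\mu^c$ is minus the cross product of $\theta$ and $A_\mu$. The sketch below compares the matrix form with the component form for one sample choice of $\theta^a$ and $A_\mu^a$ at fixed $\mu$; the sample numbers and helper names are made up here, and the $\partial_\mu\theta^a$ term is left out since it is the same in both forms.

```python
# Generators of SU(2) in the defining representation: T^a = tau^a / 2
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def to_matrix(components):
    """Build the matrix A = A^a T^a from its three components."""
    m = [[0, 0], [0, 0]]
    for c, t in zip(components, T):
        m = add(m, scale(c, t))
    return m

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])

def delta_matrix_form(theta, A):
    """Homogeneous part of (3): i theta^a [T^a, A_mu]."""
    return scale(1j, commutator(to_matrix(theta), to_matrix(A)))

def delta_component_form(theta, A):
    """Homogeneous part of (4) with f^{abc} = epsilon^{abc}: -(theta x A)^a T^a."""
    return to_matrix([-c for c in cross(theta, A)])

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

theta, A = (0.1, 0.2, 0.3), (1.0, -2.0, 0.5)
print(max_abs_diff(delta_matrix_form(theta, A), delta_component_form(theta, A)))
```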


A transformation U that depends on the spacetime coordinates x is known as a gauge
transformation or local transformation. A Lagrangian $\mathcal{L}$ invariant under a gauge
transformation is said to be gauge invariant.


Construction of the field strength

We can now immediately write a gauge invariant Lagrangian, namely

$$\mathcal{L} = (D_\mu\varphi)^\dagger(D^\mu\varphi) - V(\varphi^\dagger\varphi) \tag{5}$$

but the gauge potential $A_\mu$ does not yet have dynamics of its own. In the familiar example
of U(1) gauge invariance, we have written the coupling of the electromagnetic potential
$A_\mu$ to the matter field $\varphi$, but we have yet to write the Maxwell term $-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}$ in the
Lagrangian. Our first task is to construct a field strength $F_{\mu\nu}$ out of $A_\mu$. How do we do
that? Yang and Mills apparently did it by trial and error. As an exercise you might also
want to try that before reading on.

At this point the language of differential forms introduced in chapter IV.4 proves to
be of use. It is convenient to absorb a factor of −i by defining $A_\mu^M \equiv -iA_\mu^P$, where $A_\mu^P$
denotes the gauge potential we have been using all along. Until further notice, when
we write $A_\mu$ we mean $A_\mu^M$. Referring to (1) we see that the covariant derivative has the
cleaner form $D_\mu = \partial_\mu + A_\mu$. (Incidentally, the superscripts M and P indicate the potential
appearing in the mathematical and physical literature, respectively.) As before, let us
introduce $A = A_\mu dx^\mu$, now a matrix 1-form, that is, a form that also happens to be a matrix
in the defining representation of the Lie algebra [e.g., an N by N traceless hermitean matrix
for SU(N)]. Note that

$$A^2 = A_\mu A_\nu dx^\mu dx^\nu = \tfrac{1}{2}[A_\mu, A_\nu]dx^\mu dx^\nu$$

is not zero for a nonabelian gauge potential. (Obviously, there is no such object in
electromagnetism.)

Our task is to construct a 2-form $F = \frac{1}{2}F_{\mu\nu}dx^\mu dx^\nu$ out of the 1-form A. We adopt a direct
approach. Out of A we can construct only two possible 2-forms: dA and $A^2$. So F must be
a linear combination of the two.

In the notation we are using the transformation law (2) reads

$$A \to UAU^\dagger + UdU^\dagger \tag{6}$$

with U a 0-form (and so $dU^\dagger = \partial_\mu U^\dagger dx^\mu$). Applying d to (6) we have

$$dA \to UdAU^\dagger + dUAU^\dagger - UAdU^\dagger + dUdU^\dagger \tag{7}$$

Note the minus sign in the third term, from moving the 1-form d past the 1-form A. On
the other hand, squaring (6) we have

$$A^2 \to UA^2U^\dagger + UAdU^\dagger + UdU^\dagger UAU^\dagger + UdU^\dagger UdU^\dagger \tag{8}$$

Applying d to $UU^\dagger = 1$ we have $UdU^\dagger = -dUU^\dagger$. Thus, we can rewrite (8) as

$$A^2 \to UA^2U^\dagger + UAdU^\dagger - dUAU^\dagger - dUdU^\dagger \tag{9}$$

Lo and behold! If we add (7) and (9), six terms knock each other off, leaving us with
something nice and clean:

$$dA + A^2 \to U(dA + A^2)U^\dagger \tag{10}$$
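For bookkeeping, it may help to see the sum of (7) and (9) with the six cancelling terms paired off. The labels (a), (b), (c) below are introduced only to mark which terms cancel against which.

```latex
\begin{aligned}
dA + A^2 \to{}& U\,dA\,U^\dagger
  + \underbrace{dU\,A\,U^\dagger}_{(a)}
  - \underbrace{U A\,dU^\dagger}_{(b)}
  + \underbrace{dU\,dU^\dagger}_{(c)} \\
&+ U A^2 U^\dagger
  + \underbrace{U A\,dU^\dagger}_{(b)}
  - \underbrace{dU\,A\,U^\dagger}_{(a)}
  - \underbrace{dU\,dU^\dagger}_{(c)} \\
={}& U(dA + A^2)U^\dagger
\end{aligned}
```

The first line is (7) and the second is (9); each labeled term appears once with each sign.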

The mathematical structure thus led Yang and Mills to define the field strength 

F = dA + A 2 (11) 

Unlike A, the field strength 2-form F transforms homogeneously (10): 

F^UFU 1 (12) 

In the abelian case A 2 vanishes and F reduces to the usual electromagnetic form. In the 
nonabelian case, F is not gauge invariant, but gauge covariant. 

Of course, you can also construct $F^a_{\mu\nu}$ without using differential forms. As an exercise
you should do it starting with (4). The exercise will make you appreciate differential forms!
At the very least, we can regard differential forms as an elegantly compact notation that
suppresses the indices a and $\mu$ in (4). At the same time, the fact that (11) emerges so
smoothly clearly indicates a profound underlying mathematical structure. Indeed, there
is a one-to-one translation between the physicist’s language of gauge theory and the
mathematician’s language of fiber bundles.

Let me show you another route to (11). In analogy to d, define $D = d + A$, understood
as an operator acting on a form to its right. Let us calculate

$$ D^2 = (d + A)(d + A) = d^2 + dA + Ad + A^2 $$

The first term vanishes, the second can be written as $dA = (dA) - Ad$; the parenthesis
emphasizes that d acts only on A. Thus,

$$ D^2 = (dA) + A^2 = F $$   (13)

Pretty slick? I leave it as an exercise for you to show that $D^2$ transforms homogeneously
and hence so does F.

Elegant though differential forms are, in physics it is often desirable to write more
explicit formulas. We can write (11) out as

$$ F = (\partial_\mu A_\nu + A_\mu A_\nu)\,dx^\mu dx^\nu = \tfrac{1}{2}(\partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu])\,dx^\mu dx^\nu $$   (14)

With the definition $F = \tfrac{1}{2}F_{\mu\nu}\,dx^\mu dx^\nu$ we have

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] $$   (15)

At this point, we might also want to switch back to physicist’s notation. Recall that $A_\mu$
in (15) is actually $A^M_\mu \equiv -iA^P_\mu$ and so by analogy define $F^M_{\mu\nu} \equiv -iF^P_{\mu\nu}$. Thus,

$$ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - i[A_\mu, A_\nu] $$   (16)

where, until further notice, $A_\mu$ stands for $A^P_\mu$. (One way to see the necessity for the i in (16)
is to remember that physicists like to take $A_\mu$ to be a hermitean matrix and the commutator
of two hermitean matrices is antihermitean.)

As long as we are being explicit we might as well go all the way and exhibit the group
indices as well as the Lorentz indices. We already wrote $A_\mu = A^a_\mu T^a$ and so we naturally
write $F_{\mu\nu} = F^a_{\mu\nu} T^a$. Then (16) becomes

$$ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + f^{abc} A^b_\mu A^c_\nu $$   (17)

I mention in passing that for SU(2) A and F transform as vectors and the structure
constant $f^{abc}$ is just $\varepsilon^{abc}$, so the vector notation $\vec F_{\mu\nu} = \partial_\mu \vec A_\nu - \partial_\nu \vec A_\mu + \vec A_\mu \times \vec A_\nu$ is often
used.
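As a concrete check of the SU(2) remark, here is a minimal sketch in plain Python. It assumes the standard choice $T^a = \sigma^a/2$ (Pauli matrices over 2) for the defining representation; the helper names `matmul`, `commutator`, `trace`, `epsilon`, and `structure_constant` are illustrative only. It extracts $f^{abc}$ from $[T^a, T^b] = i f^{abc} T^c$ and compares it with $\varepsilon^{abc}$.

```python
# Pauli matrices as nested lists; the generators are T^a = sigma^a / 2 (assumed convention).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[x / 2 for x in row] for row in s] for s in SIGMA]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def structure_constant(a, b, c):
    """f^{abc} = -2i tr([T^a, T^b] T^c), which follows from tr T^a T^b = delta^{ab} / 2."""
    return (-2j * trace(matmul(commutator(T[a], T[b]), T[c]))).real
```

With these generators the commutator term $f^{abc} A^b_\mu A^c_\nu$ in (17) is literally the a-th component of a cross product, which is why the vector notation works for SU(2).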


The Yang-Mills Lagrangian 

Given that F transforms homogeneously (12) we can immediately write down the analog
of the Maxwell Lagrangian, namely the Yang-Mills Lagrangian

$$ \mathcal{L} = -\frac{1}{2g^2}\,\mathrm{tr}\,F_{\mu\nu}F^{\mu\nu} $$   (18)

We are normalizing $T^a$ by $\mathrm{tr}\,T^a T^b = \tfrac{1}{2}\delta^{ab}$ so that $\mathcal{L} = -(1/4g^2)F^a_{\mu\nu}F^{a\mu\nu}$. The theory
described by this Lagrangian is known as pure Yang-Mills theory or nonabelian gauge
theory.

Apart from the quadratic term $(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2$, the Lagrangian $\mathcal{L} = -(1/4g^2)F^a_{\mu\nu}F^{a\mu\nu}$
also contains a cubic term $f^{abc}A^{b\mu}A^{c\nu}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)$ and a quartic term $(f^{abc}A^b_\mu A^c_\nu)^2$.
As in electromagnetism the quadratic term describes the propagation of a massless vector
boson carrying an internal index a, known as the nonabelian gauge boson or the Yang-Mills
boson. The cubic and quartic terms are not present in electromagnetism and describe
the self-interaction of the nonabelian gauge boson. The corresponding Feynman rules are
given in figure IV.5.1a, b, and c.

The physics behind this self-interaction of the Yang-Mills bosons is not hard to understand.
The photon couples to charged fields but is not charged itself. Just as the charge
of a field tells us how the field transforms under the U(1) gauge group, the analog of the
charge of a field in a nonabelian gauge theory is the representation the field belongs to. The
Yang-Mills bosons couple to all fields transforming nontrivially under the gauge group. But
the Yang-Mills bosons themselves transform nontrivially: In fact, as we have noted, they
transform under the adjoint representation. Thus, they must couple to themselves.

[Figure IV.5.1: panels (a), (b), and (c)]

Pure Maxwell theory is free and so essentially trivial. It contains a noninteracting photon. 
In contrast, pure Yang-Mills theory contains self-interaction and is highly nontrivial. Note 
that the structure coefficients $f^{abc}$ are completely fixed by group theory, and thus in
contrast to a scalar field theory, the cubic and quartic self-interactions of the gauge bosons, 
including their relative strengths, are totally fixed by symmetry. If any 4-dimensional field 
theory can be solved exactly, pure Yang-Mills theory may be it, but in spite of the enormous 
amount of theoretical work devoted to it, it remains unsolved (see chapters VII.3 and VII.4). 


’t Hooft’s double-line formalism 

While it is convenient to use the component fields $A^a_\mu$ for many purposes, the matrix
field $A_\mu = A^a_\mu T^a$ embodies the mathematical structure of nonabelian gauge theory more
elegantly. The propagator for the components of the matrix field in a U(N) gauge theory
has the form

$$ \langle 0|T A_\mu(x)^i_{\ j}\,A_\nu(0)^k_{\ l}|0\rangle = \langle 0|T A^a_\mu(x) A^b_\nu(0)|0\rangle\,(T^a)^i_{\ j}(T^b)^k_{\ l} \propto \delta^{ab}(T^a)^i_{\ j}(T^b)^k_{\ l} \propto \delta^i_l\,\delta^k_j $$   (19)

[We have gone from an SU(N) to a U(N) theory for the sake of simplicity. The generators
of SU(N) satisfy a traceless condition $\mathrm{tr}\,T^a = 0$, as a result of which we would have to
subtract $\frac{1}{N}\delta^i_j\delta^k_l$ from the right-hand side.] The matrix structure $A^i_{\ j}$ naturally suggests that
we, following ’t Hooft, introduce a double-line formalism, in which the gauge potential is
described by two lines, each associated with one of the two indices i and j. We choose the
convention that the upper index flows into the diagram, while the lower index flows out
of the diagram. The propagator in (19) is represented in figure IV.5.2a. The double-line
formalism allows us to reproduce the index structure $\delta^i_l\delta^k_j$ naturally. The cubic and quartic
couplings are represented in figure IV.5.2b and c.
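The index structure in (19) can be checked numerically for N = 2. The sketch below assumes the generators $T^a = \sigma^a/2$ for SU(2), plus one extra generator proportional to the identity (normalized to $\mathrm{tr}\,TT = 1/2$) to extend to U(2); all function names are illustrative. It confirms that the U(2) sum gives the pure double-line structure, while the SU(2) sum differs by the trace term.

```python
N = 2  # size of the defining representation

# Pauli matrices; the SU(2) generators are T^a = sigma^a / 2 (assumed convention).
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SU2_GENERATORS = [[[x / 2 for x in row] for row in s] for s in PAULI]

# Extra U(1) generator of U(2), normalized so that tr(T T) = 1/2 as well.
U1_GENERATOR = [[0.5, 0.0], [0.0, 0.5]]
U2_GENERATORS = SU2_GENERATORS + [U1_GENERATOR]


def delta(a, b):
    return 1.0 if a == b else 0.0


def index_sum(generators, i, j, k, l):
    """Sum over a of (T^a)^i_j (T^a)^k_l, reading t[i][j] as (T)^i_j."""
    return sum(t[i][j] * t[k][l] for t in generators)


def u_n_expected(i, j, k, l):
    """U(N): the pure double-line structure delta^i_l delta^k_j (times 1/2)."""
    return 0.5 * delta(i, l) * delta(k, j)


def su_n_expected(i, j, k, l):
    """SU(N): the same structure with (1/N) delta^i_j delta^k_l subtracted."""
    return 0.5 * (delta(i, l) * delta(k, j) - delta(i, j) * delta(k, l) / N)
```

The overall factor 1/2 simply reflects the normalization $\mathrm{tr}\,T^a T^b = \tfrac{1}{2}\delta^{ab}$; the text only tracks proportionality.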

The constant g introduced in (18) is known as the Yang-Mills coupling constant. We can
always write the quadratic term in (18) in the convention commonly used in electromagnetism
by a trivial rescaling $A \to gA$. After this rescaling, the cubic and quartic couplings
of the Yang-Mills boson go as $g$ and $g^2$, respectively. The covariant derivative in (1) becomes
$D_\mu\varphi = \partial_\mu\varphi - igA_\mu\varphi$, showing that g also measures the coupling of the Yang-Mills
boson to matter. The convention we used, however, brings out the mathematical structure
more clearly. As written in (18), $g^2$ measures the ease with which the Yang-Mills boson can
propagate. Recall that in chapter III.7 we also found this way of defining the coupling as
a measure of propagation useful in electromagnetism. We will see in chapter VIII.1 that
Newton’s coupling appears in the same way in the Einstein-Hilbert action for gravity.


The θ term

Besides $\mathrm{tr}\,F_{\mu\nu}F^{\mu\nu}$, we can also form the dimension-4 term $\varepsilon^{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho}$. Clearly,
this term violates time reversal invariance T and parity P since it involves one time
index and three space indices. We will see later that the strong interaction is described
by a nonabelian gauge theory, with the Lagrangian containing the so-called θ term
$(\theta/32\pi^2)\,\varepsilon^{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho}$. As you will show in exercise IV.5.3 this term is a total divergence
and does not contribute to the equation of motion. Nevertheless, it induces an
electric dipole moment for the neutron. The experimental upper bound on the electric
dipole moment for the neutron translates into an upper bound on θ of the order $10^{-9}$. I
will not go into how particle physicists resolve the problem of making sure that θ is small
enough or vanishes outright.


Coupling to matter fields 

We took the scalar field $\varphi$ to transform in the fundamental representation of the group.
In general, $\varphi$ can transform in an arbitrary representation $\mathcal{R}$ of the gauge group G. We
merely have to write the covariant derivative more generally as

$$ D_\mu\varphi = (\partial_\mu - iA^a_\mu T^a_{\mathcal{R}})\varphi $$   (20)

where $T^a_{\mathcal{R}}$ represents the ath generator in the representation $\mathcal{R}$ (see exercise IV.5.1).

Clearly, the prescription to turn a globally symmetric theory into a locally symmetric
theory is to replace the ordinary derivative $\partial_\mu$ acting on any field, boson or fermion,
belonging to the representation $\mathcal{R}$ by the covariant derivative $D_\mu = (\partial_\mu - iA^a_\mu T^a_{\mathcal{R}})$. Thus,
the coupling of the nonabelian gauge potential to a fermion field is given by

$$ \mathcal{L} = \bar\psi(i\gamma^\mu D_\mu - m)\psi = \bar\psi(i\gamma^\mu\partial_\mu + \gamma^\mu A^a_\mu T^a_{\mathcal{R}} - m)\psi $$   (21)

Fields listen to the Yang-Mills gauge bosons according to the representation $\mathcal{R}$ that
they belong to, and those that belong to the trivial identity representation do not hear
the call of the gauge bosons. In the special case of a U(1) gauge theory, also known as
electromagnetism, $\mathcal{R}$ corresponds to the electric charge of the field. Those fields that
transform trivially under U(1) are electrically neutral.


Appendix 


Let me show you another context, somewhat surprising at first sight, in which the Yang-Mills structure pops up.²
Consider Schrödinger’s equation

$$ i\frac{\partial}{\partial t}\Psi(t) = H(t)\Psi(t) $$   (22)

with a time dependent Hamiltonian H(t). The setup is completely general: For instance, we could be talking
about spin states in a magnetic field or about a single particle nonrelativistic Hamiltonian with the wave function
$\Psi(\vec x, t)$. We suppress the dependence of H and $\Psi$ on variables other than time t.

First, solve the eigenvalue problem of H(t). Suppose that because of symmetry or some other reason the
spectrum of H(t) contains an n-fold degeneracy, in other words, there exist n distinct solutions of the equations
$H(t)\psi_a(t) = E(t)\psi_a(t)$ with $a = 1, \cdots, n$. Note that E(t) can vary with time and that we are assuming that the
degeneracy persists with time, that is, the degeneracy does not occur “accidentally” at one instant in time. We can
always replace H(t) by H(t) − E(t) so that henceforth we have $H(t)\psi_a(t) = 0$. Also, the states can be chosen to
be orthogonal so that $\langle\psi_b(t)|\psi_a(t)\rangle = \delta_{ba}$. (For notational reasons it is convenient to jump back and forth between
the Schrödinger and the Dirac notation. To make it absolutely clear, we have

$$ \langle\psi_b(t)|\psi_a(t)\rangle = \int d\vec x\,\psi^*_b(\vec x, t)\,\psi_a(\vec x, t) $$

if we are talking about single particle quantum mechanics.)

Let us now study (22) in the adiabatic limit, that is, we assume that the time scale over which H(t) varies is
much longer than $1/\Delta E$, where $\Delta E$ denotes the energy gap separating the states $\psi_a(t)$ from neighboring states.
In that case, if $\Psi(t)$ starts out in the subspace spanned by $\{\psi_a(t)\}$ it will stay in that subspace and we can write
$\Psi(t) = \sum_a c_a(t)\psi_a(t)$. Plugging this into (22) we obtain immediately $\sum_a[(dc_a/dt)\psi_a(t) + c_a(t)(\partial\psi_a/\partial t)] = 0$.
Taking the scalar product with $\psi_b(t)$ we obtain

$$ \frac{dc_b}{dt} = -\sum_a A_{ba}c_a $$   (23)

with the n by n matrix

$$ A_{ba}(t) \equiv i\langle\psi_b(t)|\frac{\partial\psi_a}{\partial t}\rangle $$   (24)

Now suppose somebody else decides to use a different basis, $\psi'_a(t) = U^*_{ac}(t)\psi_c(t)$, related to ours by a unitary
transformation. (The complex conjugate on the unitary matrix U is just a notational choice so that our final
equation will come out looking the same as a celebrated equation in the text; see below.) I have also passed
to the repeated indices summed notation. Differentiate to obtain $(\partial\psi'_a/\partial t) = U^*_{ac}(t)(\partial\psi_c/\partial t) + (dU^*_{ac}/dt)\psi_c(t)$.
Contracting this with $\psi'^*_b(t) = U_{bd}(t)\psi^*_d(t)$ and multiplying by i, we find

$$ A' = UAU^\dagger + iU\frac{\partial U^\dagger}{\partial t} $$   (25)


2 F. Wilczek and A. Zee, “Appearance of Gauge Structure in Simple Dynamical Systems,” Phys. Rev. Lett.
52:2111, 1984.

Suppose the Hamiltonian H(t) depends on d parameters $\lambda^1, \cdots, \lambda^d$. We vary the parameters, thus tracing out a
path defined by $\{\lambda^\mu(t), \mu = 1, \cdots, d\}$ in the d-dimensional parameter space. For example, for a spin Hamiltonian,
$\{\lambda^\mu\}$ could represent an external magnetic field. Now (23) becomes

$$ \frac{dc_b}{dt} = -\sum_a (A_\mu)_{ba}\,c_a\,\frac{d\lambda^\mu}{dt} $$   (26)

if we define $(A_\mu)_{ba} \equiv i\langle\psi_b|\partial_\mu\psi_a\rangle$, where $\partial_\mu \equiv \partial/\partial\lambda^\mu$, and (25) generalizes to

$$ A'_\mu = UA_\mu U^\dagger + iU\partial_\mu U^\dagger $$   (27)

We have recovered (IV.5.2). Lo and behold, a Yang-Mills gauge potential $A_\mu$ has popped up in front of our very
eyes!

The “transport” equation (26) can be formally solved by writing $c(\lambda) = P e^{-\int A_\mu d\lambda^\mu}$, where the line integral
is over a path connecting an initial point in the parameter space to some final point $\lambda$ and P denotes a path
ordering operation. We break the path into infinitesimal segments and multiply together the noncommuting
contribution from each segment, ordered along the path. In particular, if the path is a closed curve, by
the time we return to the initial values of the parameters, the wave function will have acquired a matrix phase
factor, known as the nonabelian Berry’s phase. This discussion is clearly intimately related to the discussion of
the Aharonov-Bohm phase in the preceding chapter.
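The recipe of breaking the path into small segments and multiplying the contributions is easiest to see in the simplest special case, n = 1, where the matrix phase reduces to an ordinary (abelian) Berry phase. The sketch below is an illustration under stated assumptions, not the degenerate setup of the text: it takes a spin-1/2 state aligned with a field direction $(\theta, \phi)$, carries the direction once around a circle of fixed polar angle $\theta$, and accumulates the phase of the overlap between neighboring states. Up to sign and factor-of-i conventions this is the discretized line integral of the connection around the loop, and it should come out as half the solid angle enclosed, $\pi(1 - \cos\theta)$. The function names and the step count are hypothetical choices.

```python
import cmath
import math


def spin_state(theta, phi):
    """Two-component spin-1/2 state aligned with the direction (theta, phi)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))


def overlap(a, b):
    """Inner product <a|b> of two-component states."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def loop_phase(theta, steps=2000):
    """Phase accumulated around a circle of constant theta.

    The loop is cut into `steps` segments; each segment contributes the phase
    of the overlap between the states at its two ends.
    """
    total = 0.0
    for n in range(steps):
        phi0 = 2 * math.pi * n / steps
        phi1 = 2 * math.pi * (n + 1) / steps
        total += cmath.phase(overlap(spin_state(theta, phi0), spin_state(theta, phi1)))
    return total
```

In the nonabelian case each segment would contribute an n by n matrix instead of a number, and the order of multiplication along the path would matter, which is the role of the path ordering P.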

To see this nonabelian phase, all we have to do is to find some quantum system with degeneracy in its spectrum
and vary some external parameter such as a magnetic field.³ In their paper, Yang and Mills spoke of the degeneracy
of the proton and neutron under isospin in an idealized world and imagined transporting a proton from one point
in the universe to another. That a proton at one point can be interpreted as a neutron at another necessitates the
introduction of a nonabelian gauge potential. I find it amusing that this imagined transport can now be realized
analogously in the laboratory.

You will realize that the discussion here parallels the discussion in the text leading up to (IV.5.2). The spacetime
dependent symmetry transformation corresponds to a parameter dependent change of basis. When I discuss
gravity in chapter VIII.1 it will become clear that moving the basis $\{\psi_a\}$ around in the parameter space is the
precise analog of parallel transporting a local coordinate frame in differential geometry and general relativity. We
will also encounter the quantity $P e^{-\int A_\mu d\lambda^\mu}$ again in chapter VII.1 in the guise of a Wilson loop.


Exercises 

IV.5.1 Write down the Lagrangian of an SU(2) gauge theory with a scalar field in the I = 2 representation.

IV.5.2 Prove the Bianchi identity $DF \equiv dF + [A, F] = 0$. Write this out explicitly with indices and show that in
the abelian case it reduces to half of Maxwell’s equations.

IV.5.3 In 4 dimensions $\varepsilon^{\mu\nu\lambda\rho}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\rho}$ can be written as $\mathrm{tr}\,F^2$. Show that $d\,\mathrm{tr}\,F^2 = 0$ in any dimensions.

IV.5.4 Invoking the Poincaré lemma (IV.4.5) and the result of exercise IV.5.3 show that $\mathrm{tr}\,F^2 = d\,\mathrm{tr}(A\,dA + \tfrac{2}{3}A^3)$.
Write this last equation out explicitly with indices. Identify these quantities in the case of electromagnetism.

IV.5.5 For a challenge show that the $\mathrm{tr}\,F^n$, which appear in higher dimensional theories such as string theory, are
all total divergences. In other words, there exists a (2n − 1)-form $\omega_{2n-1}(A)$ such that $\mathrm{tr}\,F^n = d\omega_{2n-1}(A)$.
[Hint: A compact representation of the form $\omega_{2n-1}(A) = \int_0^1 dt\,f_{2n-1}(t, A)$ exists.] Work out $\omega_5(A)$
explicitly and try to generalize knowing $\omega_3$ and $\omega_5$. Determine the (2n − 1)-form $f_{2n-1}(t, A)$. For help,
see B. Zumino et al., Nucl. Phys. B239:477, 1984.

IV.5.6 Write down the Lagrangian of an SU(3) gauge theory with a fermion field in the fundamental or defining
triplet representation.


3 A. Zee, “On the Non-Abelian Gauge Structure in Nuclear Quadrupole Resonance,” Phys. Rev. A38:1, 1988.
The proposed experiment was later done by A. Pines.




The Anderson-Higgs Mechanism 


The gauge potential eats the Nambu-Goldstone boson 

As I noted earlier the ability to ask good questions is of crucial importance in physics. 
Here is an excellent question: How does spontaneous symmetry breaking manifest itself 
in gauge theories? 

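Going back to chapter IV.1, we gauge the U(1) theory in (IV.1.6) by replacing $\partial_\mu\varphi$ with
$D_\mu\varphi = (\partial_\mu - ieA_\mu)\varphi$ so that

$$ \mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + (D\varphi)^\dagger D\varphi + \mu^2\varphi^\dagger\varphi - \lambda(\varphi^\dagger\varphi)^2 $$   (1)

Now when we go to polar coordinates $\varphi = \rho e^{i\theta}$ we have $D_\mu\varphi = [\partial_\mu\rho + i\rho(\partial_\mu\theta - eA_\mu)]e^{i\theta}$
and thus

$$ \mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \rho^2(\partial_\mu\theta - eA_\mu)^2 + (\partial\rho)^2 + \mu^2\rho^2 - \lambda\rho^4 $$   (2)

(Compare this with $\mathcal{L} = \rho^2(\partial_\mu\theta)^2 + (\partial\rho)^2 + \mu^2\rho^2 - \lambda\rho^4$ in the absence of the gauge field.)
Under a gauge transformation $\varphi \to e^{i\alpha}\varphi$ (so that $\theta \to \theta + \alpha$) and $eA_\mu \to eA_\mu + \partial_\mu\alpha$, and
thus the combination $B_\mu \equiv A_\mu - (1/e)\partial_\mu\theta$ is gauge invariant. The first two terms in $\mathcal{L}$
thus become $-\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + e^2\rho^2B_\mu^2$. Note that $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \partial_\mu B_\nu - \partial_\nu B_\mu$ has the
same form in terms of the potential $B_\mu$.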

Upon spontaneous symmetry breaking, we write $\rho = (1/\sqrt{2})(v + \chi)$, with $v = \sqrt{\mu^2/\lambda}$.
Hence

$$ \mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}M^2B_\mu^2 + e^2v\chi B_\mu^2 + \tfrac{1}{2}e^2\chi^2B_\mu^2 + \tfrac{1}{2}(\partial\chi)^2 - \mu^2\chi^2 - \sqrt{\lambda}\,\mu\chi^3 - \frac{\lambda}{4}\chi^4 + \frac{\mu^4}{4\lambda} $$   (3)

The theory now consists of a vector field $B_\mu$ with mass

$$ M = ev $$   (4)

interacting with a scalar field $\chi$ with mass $\sqrt{2}\mu$. The phase field $\theta$, which would have been
the Nambu-Goldstone boson in the ungauged theory, has disappeared. We say that the
gauge field $A_\mu$ has eaten the Nambu-Goldstone boson; it has gained weight and changed
its name to $B_\mu$.
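The $\chi$-dependent, $B_\mu$-independent terms of (3) can be verified numerically against the potential terms $\mu^2\rho^2 - \lambda\rho^4$ of (2) after the shift $\rho = (v + \chi)/\sqrt{2}$. The following is a small sanity check; the function names and the sample parameter values used to exercise it are arbitrary choices for illustration.

```python
import math


def potential_terms_polar(rho, mu, lam):
    """Non-derivative terms of (2): mu^2 rho^2 - lambda rho^4."""
    return mu**2 * rho**2 - lam * rho**4


def potential_terms_expanded(chi, mu, lam):
    """Non-derivative, B-independent terms of (3)."""
    return (-mu**2 * chi**2
            - math.sqrt(lam) * mu * chi**3
            - lam / 4 * chi**4
            + mu**4 / (4 * lam))


def shifted_rho(chi, mu, lam):
    """rho = (v + chi) / sqrt(2) with v = sqrt(mu^2 / lambda)."""
    v = math.sqrt(mu**2 / lam)
    return (v + chi) / math.sqrt(2)


def vector_mass(e, mu, lam):
    """M = e v, read off from e^2 rho^2 B^2 -> (1/2) (e v)^2 B^2 at chi = 0."""
    return e * math.sqrt(mu**2 / lam)
```

The absence of a term linear in $\chi$ is the statement that $v$ sits at the minimum of the potential, and the coefficient $-\mu^2\chi^2$ gives the scalar mass $\sqrt{2}\mu$ quoted in the text.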

Recall that a massless gauge field has only 2 degrees of freedom, while a massive gauge 
field has 3 degrees of freedom. A massless gauge field has to eat a Nambu-Goldstone boson 
in order to have the requisite number of degrees of freedom. The Nambu-Goldstone 
boson becomes the longitudinal degree of freedom of the massive gauge field. We do not 
lose any degrees of freedom, as we had better not. 

This phenomenon of a massless gauge field becoming massive by eating a Nambu-Goldstone
boson was discovered by numerous particle physicists¹ and is known as the
Higgs mechanism. People variously call $\varphi$, or more restrictively $\chi$, the Higgs field. The
same phenomenon was discovered in the context of condensed matter physics by Landau,
Ginzburg, and Anderson, and is known as the Anderson mechanism.

Let us give a slightly more involved example, an O(3) gauge theory with a Higgs field $\varphi^a$
(a = 1, 2, 3) transforming in the vector representation. The Lagrangian contains the kinetic
energy term $\tfrac{1}{2}(D_\mu\varphi^a)^2$, with $D_\mu\varphi^a = \partial_\mu\varphi^a + g\varepsilon^{abc}A^b_\mu\varphi^c$ as indicated in (IV.5.20). Upon
spontaneous symmetry breaking, $\vec\varphi$ acquires a vacuum expectation value which without
loss of generality we can choose to point in the 3-direction, so that $\langle\varphi^a\rangle = v\delta^{a3}$. We set
$\varphi^3 = v$ and see that

$$ \tfrac{1}{2}(D_\mu\varphi^a)^2 \to \tfrac{1}{2}(gv)^2(A^1_\mu A^{\mu 1} + A^2_\mu A^{\mu 2}) $$   (5)

The gauge potentials $A^1_\mu$ and $A^2_\mu$ acquire mass gv [compare with (4)] while $A^3_\mu$ remains
massless.

A more elaborate example is that of an SU(5) gauge theory with $\varphi$ transforming as the
24-dimensional adjoint representation. (See appendix B for the necessary group theory.) The
field $\varphi$ is a 5 by 5 hermitean traceless matrix. Since the adjoint representation transforms
as $\varphi \to \varphi + i\theta^a[T^a, \varphi]$, we have $D_\mu\varphi = \partial_\mu\varphi - igA^a_\mu[T^a, \varphi]$ with $a = 1, \cdots, 24$ running
over the 24 generators of SU(5). By a symmetry transformation the vacuum expectation
value of $\varphi$ can be taken to be diagonal, $\langle\varphi^i_j\rangle = v_j\delta^i_j$ ($i, j = 1, \cdots, 5$), with $\sum_j v_j = 0$. (This
is the analog of our choosing $\langle\vec\varphi\rangle$ to point in the 3-direction in the preceding example.) We
have in the Lagrangian

$$ \mathrm{tr}(D_\mu\varphi)(D^\mu\varphi) \to g^2\,\mathrm{tr}\,[T^a, \langle\varphi\rangle][\langle\varphi\rangle, T^b]\,A^a_\mu A^{\mu b} $$   (6)

The gauge boson masses squared are given by the eigenvalues of the 24 by 24 matrix
$g^2\,\mathrm{tr}\,[T^a, \langle\varphi\rangle][\langle\varphi\rangle, T^b]$, which we can compute laboriously for any given $\langle\varphi\rangle$.


1 Including P. Higgs, F. Englert, R. Brout, G. Guralnik, C. Hagen, and T. Kibble.

It is easy to see, however, which gauge bosons remain massless. As a specific example
(which will be of interest to us in chapter VII.6), suppose

$$ \langle\varphi\rangle = v\begin{pmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & -3 \end{pmatrix} $$   (7)

Which generators $T^a$ commute with $\langle\varphi\rangle$? Clearly, generators of the form $\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}$ and of the
form $\begin{pmatrix} 0 & 0 \\ 0 & B \end{pmatrix}$. Here A represents 3 by 3 hermitean traceless matrices (of which there are
$3^2 - 1 = 8$, the so-called Gell-Mann matrices) and B represents 2 by 2 hermitean traceless
matrices (of which there are $2^2 - 1 = 3$, namely the Pauli matrices). Furthermore, the
generator

$$ \begin{pmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & -3 \end{pmatrix} $$   (8)

being proportional to $\langle\varphi\rangle$, obviously commutes with $\langle\varphi\rangle$. Clearly, these generators generate
SU(3), SU(2), and U(1), respectively. Thus, in the 24 by 24 mass-squared matrix
$g^2\,\mathrm{tr}\,[T^a, \langle\varphi\rangle][\langle\varphi\rangle, T^b]$ there are blocks of submatrices that vanish, namely, an 8 by 8 block,
a 3 by 3 block, and a 1 by 1 block. We have 8 + 3 + 1 = 12 massless gauge bosons. The
remaining 24 − 12 = 12 gauge bosons acquire mass.
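The count of 12 unbroken generators can be reproduced by brute force. The sketch below builds one convenient basis of 24 traceless hermitean 5 by 5 matrices (a generalized Gell-Mann-type basis, left unnormalized since only commutation matters) and counts how many commute with the vacuum expectation value in (7). The basis choice and all function names are assumptions made for this illustration. Counting basis elements gives the right answer here because, for a diagonal vacuum expectation value, every element of this particular basis either commutes with it or belongs to a pair of elements that its commutator rotates into each other.

```python
def hermitean_basis(n):
    """Return n*n - 1 traceless hermitean n x n matrices (unnormalized)."""
    basis = []
    for i in range(n):
        for j in range(i + 1, n):
            sym = [[0j] * n for _ in range(n)]
            sym[i][j] = sym[j][i] = 1
            asym = [[0j] * n for _ in range(n)]
            asym[i][j], asym[j][i] = -1j, 1j
            basis += [sym, asym]
    for m in range(1, n):
        diag = [[0j] * n for _ in range(n)]
        for i in range(m):
            diag[i][i] = 1
        diag[m][m] = -m
        basis.append(diag)
    return basis


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutes(a, b, tol=1e-12):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) < tol for i in range(n) for j in range(n))


def diagonal(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


VEV = diagonal([2, 2, 2, -3, -3])   # the matrix in (7), with v set to 1
BASIS = hermitean_basis(5)
UNBROKEN = [t for t in BASIS if commutes(t, VEV)]
BROKEN = [t for t in BASIS if not commutes(t, VEV)]
```

Here `len(UNBROKEN)` and `len(BROKEN)` are the numbers of massless and massive gauge bosons, respectively.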


Counting massless gauge bosons 

In general, consider a theory with the global symmetry group G spontaneously broken to
a subgroup H. As we learned in chapter IV.1, n(G) − n(H) Nambu-Goldstone bosons
appear. Now suppose the symmetry group G is gauged. We start with n(G) massless
gauge bosons, one for each generator. Upon spontaneous symmetry breaking, the n(G) −
n(H) Nambu-Goldstone bosons are eaten by n(G) − n(H) gauge bosons, leaving n(H)
massless gauge bosons, exactly the right number since the gauge bosons associated with
the surviving gauge group H should remain massless.
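The bookkeeping is simple enough to write as a few lines of code. The sketch below uses the standard generator counts $N^2 - 1$ for SU(N) and $N(N-1)/2$ for O(N); the function names are illustrative. Applied to the three examples of this chapter, it reproduces the counts of massive and massless gauge bosons.

```python
def dim_su(n):
    """Number of generators of SU(n)."""
    return n * n - 1


def dim_o(n):
    """Number of generators of O(n)."""
    return n * (n - 1) // 2


DIM_U1 = 1  # U(1) has a single generator


def count_gauge_bosons(n_g, n_h):
    """Return (massive, massless) counts when G with n_g generators breaks to H with n_h."""
    return n_g - n_h, n_h


# U(1) broken completely
ABELIAN_HIGGS = count_gauge_bosons(DIM_U1, 0)
# O(3) broken to O(2)
O3_VECTOR = count_gauge_bosons(dim_o(3), dim_o(2))
# SU(5) broken to SU(3) x SU(2) x U(1)
SU5_ADJOINT = count_gauge_bosons(dim_su(5), dim_su(3) + dim_su(2) + DIM_U1)
```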

In our simple example, G = U(1), H = nothing: n(G) = 1 and n(H) = 0. In our second
example, G = O(3), H = O(2) ≃ U(1): n(G) = 3 and n(H) = 1, and so we end up with
one massless gauge boson. In the third example, G = SU(5), H = SU(3) ⊗ SU(2) ⊗ U(1)
so that n(G) = 24 and n(H) = 12. Further examples and generalizations are worked out
in the exercises.


Gauge boson mass spectrum

It is easy enough to work out the mass spectrum explicitly. The covariant derivative of a
Higgs field is $D_\mu\varphi = \partial_\mu\varphi + gA^a_\mu T^a\varphi$, where g is the gauge coupling, $T^a$ are the generators
of the group G when acting on $\varphi$, and $A^a_\mu$ the gauge potential corresponding to the ath
generator. Upon spontaneous symmetry breaking we replace $\varphi$ by its vacuum expectation
value $\langle\varphi\rangle = v$. Hence $D_\mu\varphi$ is replaced by $gA^a_\mu T^a v$. The kinetic term $\tfrac{1}{2}(D^\mu\varphi \cdot D_\mu\varphi)$ [here
(·) denotes the scalar product in the group G] in the Lagrangian thus becomes

$$ \tfrac{1}{2}g^2(T^a v \cdot T^b v)A^{\mu a}A^b_\mu \equiv \tfrac{1}{2}A^{\mu a}(\mu^2)^{ab}A^b_\mu $$

where we have introduced the mass-squared matrix

$$ (\mu^2)^{ab} = g^2(T^a v \cdot T^b v) $$   (9)

for the gauge bosons. [You will recognize (9) as the generalization of (4); also compare (5)
and (6).] We diagonalize $(\mu^2)^{ab}$ to obtain the masses of the gauge bosons. The eigenvectors
tell us which linear combinations of $A^a_\mu$ correspond to mass eigenstates.

Note that $\mu^2$ is an n(G) by n(G) matrix with n(H) zero eigenvalues, whose existence can
also be seen explicitly. Let $T^c$ be a generator of H. The statement that H remains unbroken
by the vacuum expectation value v means that the symmetry transformation generated by
$T^c$ leaves v invariant; in other words, $T^c v = 0$, and hence the gauge boson associated with
$T^c$ remains massless, as it should. All these points are particularly evident in the SU(5)
example we worked out.
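As an illustration of (9), the sketch below evaluates the mass-squared matrix for the O(3) example. It reads the generators off the covariant derivative $D_\mu\varphi^a = \partial_\mu\varphi^a + g\varepsilon^{abc}A^b_\mu\varphi^c$, that is, $(T^b)_{ac} = \varepsilon^{abc}$ acting on a real 3-vector, and takes the ordinary dot product as the scalar product. The function names are hypothetical.

```python
def epsilon(a, b, c):
    """Levi-Civita symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def generator(b):
    """(T^b)_{ac} = epsilon^{abc}: the b-th O(3) generator acting on a real 3-vector."""
    return [[epsilon(a, b, c) for c in range(3)] for a in range(3)]


def apply(matrix, vec):
    return [sum(matrix[a][c] * vec[c] for c in range(3)) for a in range(3)]


def mass_squared_matrix(g, vev):
    """(mu^2)^{ab} = g^2 (T^a v . T^b v), equation (9), for the O(3) vector Higgs."""
    tv = [apply(generator(b), vev) for b in range(3)]
    return [[g**2 * sum(x * y for x, y in zip(tv[a], tv[b])) for b in range(3)]
            for a in range(3)]
```

For a vacuum expectation value along the 3-direction, `mass_squared_matrix(g, [0, 0, v])` is diagonal with entries $(gv)^2$, $(gv)^2$, and 0, in agreement with (5): the third generator annihilates v and its gauge boson stays massless.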


Feynman rules in spontaneously broken gauge theories 


It is easy enough to derive the Feynman rules for spontaneously broken gauge theories.
Take, for example, (3). As usual, we look at the terms quadratic in the fields, Fourier
transform, and invert. We see that the gauge boson propagator is given by

$$ \frac{-i}{k^2 - M^2 + i\varepsilon}\left(g_{\mu\nu} - \frac{k_\mu k_\nu}{M^2}\right) $$   (10)

and the $\chi$ propagator by

$$ \frac{i}{k^2 - 2\mu^2 + i\varepsilon} $$   (11)

I leave it to you to work out the rules for the interaction vertices.

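As I said in another context, field theories often exist in several equivalent forms.
Take the U(1) theory in (1) and instead of polar coordinates go to Cartesian coordinates
$\varphi = (1/\sqrt{2})(\varphi_1 + i\varphi_2)$ so that

$$ D_\mu\varphi = \partial_\mu\varphi - ieA_\mu\varphi = \frac{1}{\sqrt{2}}\left[(\partial_\mu\varphi_1 + eA_\mu\varphi_2) + i(\partial_\mu\varphi_2 - eA_\mu\varphi_1)\right] $$

Then (1) becomes

$$ \mathcal{L} = -\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}\left[(\partial_\mu\varphi_1 + eA_\mu\varphi_2)^2 + (\partial_\mu\varphi_2 - eA_\mu\varphi_1)^2\right] + \tfrac{1}{2}\mu^2(\varphi_1^2 + \varphi_2^2) - \tfrac{1}{4}\lambda(\varphi_1^2 + \varphi_2^2)^2 $$   (12)

Spontaneous symmetry breaking means setting $\varphi_1 \to v + \varphi_1'$ with $v = \sqrt{\mu^2/\lambda}$.

The physical content of (12) and (3) should be the same. Indeed, expand the Lagrangian
(12) to quadratic order in the fields:

$$ \mathcal{L} = \frac{\mu^4}{4\lambda} - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}M^2A_\mu^2 - MA_\mu\partial^\mu\varphi_2 + \tfrac{1}{2}\left[(\partial_\mu\varphi_1')^2 - 2\mu^2\varphi_1'^2\right] + \tfrac{1}{2}(\partial_\mu\varphi_2)^2 + \cdots $$   (13)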

The spectrum, a gauge boson A with mass M = ev and a scalar boson $\varphi_1'$ with mass $\sqrt{2}\mu$,
is identical to the spectrum in (3). (The particles there were named B and $\chi$.)
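The scalar masses in the Cartesian description can be checked by computing the curvature of the potential at the vacuum. The sketch below assumes the potential energy that follows from the $\mu^2$ and $\lambda$ terms of (1) written in Cartesian components, $V = -\tfrac{1}{2}\mu^2(\varphi_1^2 + \varphi_2^2) + \tfrac{1}{4}\lambda(\varphi_1^2 + \varphi_2^2)^2$, and uses a simple central finite difference; the function names and step size are illustrative choices.

```python
import math


def potential(phi1, phi2, mu, lam):
    """V = -(1/2) mu^2 (phi1^2 + phi2^2) + (1/4) lambda (phi1^2 + phi2^2)^2."""
    r2 = phi1**2 + phi2**2
    return -0.5 * mu**2 * r2 + 0.25 * lam * r2**2


def second_derivative(f, x, h=1e-3):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2


def scalar_masses_squared(mu, lam):
    """Curvatures of V along phi1 and phi2 at the vacuum (phi1, phi2) = (v, 0)."""
    v = math.sqrt(mu**2 / lam)
    m1_sq = second_derivative(lambda x: potential(x, 0.0, mu, lam), v)
    m2_sq = second_derivative(lambda y: potential(v, y, mu, lam), 0.0)
    return m1_sq, m2_sq
```

The first curvature comes out as $2\mu^2$, the mass squared of $\varphi_1'$, and the second vanishes: $\varphi_2$ is the would-be Nambu-Goldstone direction along the circle of minima.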

But oops, you may have noticed something strange: the term $-MA_\mu\partial^\mu\varphi_2$ which mixes
the fields $A_\mu$ and $\varphi_2$. Besides, why is $\varphi_2$ still hanging around? Isn’t he supposed to have
been eaten? What to do?

We can of course diagonalize but it is more convenient to get rid of this mixing term.
Referring to the Faddeev-Popov quantization of gauge theories discussed in chapter III.4 we
note that the gauge fixing term generates a term to be added to $\mathcal{L}$. We can cancel the
undesirable mixing term by choosing the gauge function to be $f(A) = \partial A + \xi ev\varphi_2 - \sigma$. Going
through the steps, we obtain the effective Lagrangian $\mathcal{L}_{\rm eff} = \mathcal{L} - (1/2\xi)(\partial A + \xi M\varphi_2)^2$
[compare with (III.4.7)]. The undesirable cross term $-MA_\mu\partial^\mu\varphi_2$ in $\mathcal{L}$ is now canceled upon
integration by parts. In $\mathcal{L}_{\rm eff}$ the terms quadratic in A now read $-\tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{1}{2}M^2A_\mu^2 -
(1/2\xi)(\partial A)^2$ while the terms quadratic in $\varphi_2$ read $\tfrac{1}{2}[(\partial_\mu\varphi_2)^2 - \xi M^2\varphi_2^2]$, immediately giving
us the gauge boson propagator

$$ \frac{-i}{k^2 - M^2 + i\varepsilon}\left[g_{\mu\nu} - (1 - \xi)\frac{k_\mu k_\nu}{k^2 - \xi M^2 + i\varepsilon}\right] $$   (14)

and the $\varphi_2$ propagator

$$ \frac{i}{k^2 - \xi M^2 + i\varepsilon} $$   (15)

This one-parameter class of gauge choices is known as the $R_\xi$ gauge. Note that the would-be
Goldstone field $\varphi_2$ remains in the Lagrangian, but the very fact that its mass depends on
the gauge parameter $\xi$ brands it as unphysical. In any physical process, the $\xi$ dependence in
the $\varphi_2$ and A propagators must cancel out so as to leave physical amplitudes $\xi$ independent.
In exercise IV.6.9 you will verify that this is indeed the case in a simple example.
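A quick numerical check of the $R_\xi$ propagator is possible using the quadratic operator quoted in exercise IV.6.6, $Q^{\mu\nu} = -(k^2 - M^2)g^{\mu\nu} + [1 - (1/\xi)]k^\mu k^\nu$. The sketch below assumes the metric signature (+, −, −, −), drops the $i\varepsilon$ (so it should be used off shell), and assumes the standard overall factor $-i$ for the propagator, so that the expected result is $Q^{\mu\nu}D_{\nu\lambda} = i\delta^\mu_\lambda$. The function names are hypothetical.

```python
METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g, signature (+, -, -, -) assumed


def lower(k):
    """k_mu = g_{mu nu} k^nu."""
    return [METRIC[m] * k[m] for m in range(4)]


def k_squared(k):
    return sum(METRIC[m] * k[m] ** 2 for m in range(4))


def quadratic_operator(k, mass, xi):
    """Q^{mu nu} = -(k^2 - M^2) g^{mu nu} + (1 - 1/xi) k^mu k^nu (upper indices)."""
    k2 = k_squared(k)
    return [[-(k2 - mass**2) * (METRIC[m] if m == n else 0.0)
             + (1 - 1 / xi) * k[m] * k[n]
             for n in range(4)] for m in range(4)]


def r_xi_propagator(k, mass, xi):
    """Equation (14) without the i epsilon, with lower indices."""
    k2, kl = k_squared(k), lower(k)
    return [[-1j / (k2 - mass**2)
             * ((METRIC[m] if m == n else 0.0)
                - (1 - xi) * kl[m] * kl[n] / (k2 - xi * mass**2))
             for n in range(4)] for m in range(4)]


def contract(q, d):
    """Q^{mu nu} D_{nu lambda}."""
    return [[sum(q[m][n] * d[n][l] for n in range(4)) for l in range(4)]
            for m in range(4)]


def is_i_times_identity(matrix, tol=1e-9):
    return all(abs(matrix[m][l] - (1j if m == l else 0)) < tol
               for m in range(4) for l in range(4))
```

The check holds for any value of the gauge parameter, which is the algebraic content of the statement that (14) is obtained by inverting the quadratic terms in $\mathcal{L}_{\rm eff}$.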


Different gauges have different advantages 

You might wonder why we would bother with the $R_\xi$ gauge. Why not just use the equivalent
formulation of the theory in (3), known as the unitary gauge, in which the gauge boson
propagator (10) looks much simpler than (14) and in which we don’t have to deal with the
unphysical $\varphi_2$ field? The reason is that the $R_\xi$ gauge and the unitary gauge complement
each other. In the $R_\xi$ gauge, the gauge boson propagator (14) goes as $1/k^2$ for large k and
so renormalizability can be proved rather easily. On the other hand, in the unitary gauge
all fields are physical (hence the name “unitary”) but the gauge boson propagator (10)
apparently goes as $k_\mu k_\nu/k^2$ for large k; to prove renormalizability we must show that the
$k_\mu k_\nu$ piece of the propagator does not contribute. Using both gauges, we can easily prove
that the theory is both renormalizable and unitary. By the way, note that in the limit $\xi \to \infty$
(14) goes over to (10) and $\varphi_2$ disappears, at least formally.
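The formal limit can be seen by comparing the coefficient of $k_\mu k_\nu$ inside the bracket of (14), namely $(1 - \xi)/(k^2 - \xi M^2)$, with the unitary gauge value $1/M^2$ in (10). A minimal sketch with illustrative function names, ignoring the $i\varepsilon$:

```python
def r_xi_kk_coefficient(k2, mass, xi):
    """Coefficient of k_mu k_nu inside the bracket of (14): (1 - xi) / (k^2 - xi M^2)."""
    return (1 - xi) / (k2 - xi * mass**2)


def unitary_kk_coefficient(mass):
    """Coefficient of k_mu k_nu inside the bracket of (10): 1 / M^2."""
    return 1 / mass**2


def phi2_mass_squared(mass, xi):
    """Gauge-dependent mass squared of the would-be Goldstone field, from (15)."""
    return xi * mass**2
```

As $\xi$ grows at fixed $k^2$, the first coefficient approaches the second, while the $\varphi_2$ mass squared $\xi M^2$ grows without bound, so that $\varphi_2$ decouples.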

In practical calculations, there are typically many diagrams to evaluate. In the $R_\xi$ gauge,
the parameter $\xi$ darn well better disappear when we add everything up to form the physical
mass shell amplitude. The $R_\xi$ gauge is attractive precisely because this requirement
provides a powerful check on practical calculations.

I remarked earlier that strictly speaking, gauge invariance is not so much a symmetry
as the reflection of a redundancy in the degrees of freedom used. (The photon has only 2
degrees of freedom but we use a field $A_\mu$ with 4 components.) A purist would insist, in
the same vein, that there is no such thing as spontaneously breaking a gauge symmetry.
To understand this remark, note that spontaneous breaking amounts to setting $\rho \equiv |\varphi|$ to
v and $\theta$ to 0 in (2). The statement $|\varphi| = v$ is perfectly U(1) invariant: It defines a circle in $\varphi$
space. By picking out the point $\theta = 0$ on the circle in a globally symmetric theory we break
the symmetry. In contrast, in a gauge theory, we can use the gauge freedom to fix $\theta = 0$
everywhere in spacetime. Hence the purists. I will refrain from such hair-splitting in this
book and continue to use the convenient language of symmetry breaking even in a gauge
theory.


Exercises 

IV.6.1 Consider an SU(5) gauge theory with a Higgs field $\varphi$ transforming as the 5-dimensional representation:
$\varphi^i$, $i = 1, 2, \cdots, 5$. Show that a vacuum expectation value of $\varphi$ breaks SU(5) to SU(4). Now add another
Higgs field $\varphi'$, also transforming as the 5-dimensional representation. Show that the symmetry can either
remain at SU(4) or be broken to SU(3).

IV.6.2 In general, there may be several Higgs fields belonging to various representations labeled by $\alpha$. Show that
the mass squared matrix for the gauge bosons generalizes immediately to $(\mu^2)^{ab} = \sum_\alpha g^2(T^a_\alpha v_\alpha \cdot T^b_\alpha v_\alpha)$,
where $v_\alpha$ is the vacuum expectation value of $\varphi_\alpha$ and $T^a_\alpha$ is the ath generator represented on $\varphi_\alpha$. Combine
the situations described in exercises IV.6.1 and IV.6.2 and work out the mass spectrum of the gauge
bosons.


IV.6.3 The gauge group G does not have to be simple; it could be of the form $G_1 \otimes G_2 \otimes \cdots \otimes G_k$, with
coupling constants $g_1, g_2, \cdots, g_k$. Consider, for example, the case G = SU(2) ⊗ U(1) and a Higgs
field $\varphi$ transforming like the doublet under SU(2) and like a field with charge $\tfrac{1}{2}$ under U(1), so that
$D_\mu\varphi = \partial_\mu\varphi - i[gA^a_\mu(\tau^a/2) + g'\tfrac{1}{2}B_\mu]\varphi$. Let $\langle\varphi\rangle = \begin{pmatrix} 0 \\ v \end{pmatrix}$. Determine which linear combinations of the
gauge bosons $A^a_\mu$ and $B_\mu$ acquire mass.

IV.6.4 In chapter IV.5 you worked out an SU(2) gauge theory with a scalar field $\varphi$ in the I = 2 representation.
Write down the most general quartic potential $V(\varphi)$ and study the possible symmetry breaking pattern.

IV.6.5 Complete the derivation of the Feynman rules for the theory in (3) and compute the amplitude for the
physical process $\chi + \chi \to B + B$.

IV.6.6 Derive (14). [Hint: The procedure is exactly the same as that used to obtain (III.4.9).] Write $\mathcal{L} = \tfrac{1}{2}A_\mu Q^{\mu\nu}A_\nu$
with $Q^{\mu\nu} = (\partial^2 + M^2)g^{\mu\nu} - [1 - (1/\xi)]\partial^\mu\partial^\nu$ or in momentum space $Q^{\mu\nu} = -(k^2 - M^2)g^{\mu\nu} + [1 -
(1/\xi)]k^\mu k^\nu$. The propagator is the inverse of $Q^{\mu\nu}$.

IV.6.7 Work out the (· · ·) in (13) and the Feynman rules for the various interaction vertices.

IV.6.8 Using the Feynman rules derived in exercise IV.6.7 calculate the amplitude for the physical process $\varphi_1' +
\varphi_1' \to A + A$ and show that the dependence on $\xi$ cancels out. Compare with the result in exercise IV.6.5.
[Hint: There are two diagrams, one with A exchange and the other with $\varphi_2$ exchange.]

IV.6.9 Consider the theory defined in (12) with $\mu = 0$. Using the result of exercise IV.3.5 show that

$$ V_{\rm eff}(\varphi) = \frac{1}{4}\lambda\varphi^4 + \frac{1}{64\pi^2}(10\lambda^2 + 3e^4)\,\varphi^4\left(\log\frac{\varphi^2}{M^2} - \frac{25}{6}\right) + \cdots $$   (16)

where $\varphi^2 = \varphi_1^2 + \varphi_2^2$. This potential has a minimum away from $\varphi = 0$ and thus the gauge symmetry
is spontaneously broken by quantum fluctuations. In chapter IV.3 we did not have the $e^4$ term and
argued that the minimum we got there was not to be trusted. But here we can balance the $\lambda\varphi^4$ against
$e^4\varphi^4\log(\varphi^2/M^2)$ for $\lambda$ of the same order of magnitude as $e^4$. The minimum can be trusted. Show that
the spectrum of this theory consists of a massive scalar boson and a massive vector boson, with

$$ \frac{m^2(\text{scalar})}{m^2(\text{vector})} = \frac{3}{2\pi}\,\frac{e^2}{4\pi} $$

For help, see S. Coleman and E. Weinberg, Phys. Rev. D7:1888, 1973.




Chiral Anomaly 


Classical versus quantum symmetry 

I have emphasized the importance of asking good questions. Here is another good one: Is 
a symmetry of classical physics necessarily a symmetry of quantum physics? 

We have a symmetry of classical physics if a transformation $\varphi \to \varphi + \delta\varphi$ leaves the action
$S(\varphi)$ invariant. We have a symmetry of quantum physics if the transformation leaves the
path integral $\int D\varphi\, e^{iS(\varphi)}$ invariant.

When our question is phrased in this path integral language, the answer seems obvious:
Not necessarily. Indeed, the measure $D\varphi$ may or may not be invariant.

Yet historically, field theorists took as almost self-evident the notion that any symmetry 
of classical physics is necessarily a symmetry of quantum physics, and indeed, almost 
all the symmetries they encountered in the early days of field theory had the property of 
being symmetries of both classical and quantum physics. For instance, we certainly expect 
quantum mechanics to be rotationally invariant. It would be very odd indeed if quantum
fluctuations were to favor a particular direction. 

You have to appreciate the frame of mind that field theorists operated in to understand 
their shock when they discovered in the late 1960s that quantum fluctuations can indeed 
break classical symmetries. Indeed, they were so shocked as to give this phenomenon the 
rather misleading name “anomaly,” as if it were some kind of sickness of field theory. With 
the benefit of hindsight, we now understand the anomaly as being no less conceptually
innocuous than the elementary fact that when we change integration variables in an integral
we had better not forget the Jacobian.

With the passing of time, field theorists have developed many different ways of looking 
at the all-important subject of the anomaly. They are all instructive and shed different lights
on how the anomaly comes about. For this introductory text I choose to show the existence
of the anomaly by an explicit Feynman diagram calculation. The diagram method is certainly
more laborious and less slick than other methods, but the advantage is that you will see
a classical symmetry vanishing in front of your very eyes! No smooth formal argument
for us.


The lesser of two evils 


Consider the theory of a single massless fermion $\mathcal{L} = \bar\psi i\gamma^\mu\partial_\mu\psi$. You can hardly ask
for a simpler theory! Recall from chapter II.1 that $\mathcal{L}$ is manifestly invariant under the
separate transformations $\psi \to e^{i\theta}\psi$ and $\psi \to e^{i\theta\gamma^5}\psi$, corresponding to the conserved
vector current $J^\mu = \bar\psi\gamma^\mu\psi$ and the conserved axial current $J_5^\mu = \bar\psi\gamma^\mu\gamma^5\psi$ respectively.
You should verify that $\partial_\mu J^\mu = 0$ and $\partial_\mu J_5^\mu = 0$ follow immediately from the classical
equation of motion $i\gamma^\mu\partial_\mu\psi = 0$.

Let us now calculate the amplitude for a spacetime history in which a fermion-antifermion
pair is created at $x_1$ and another such pair is created at $x_2$ by the vector current,
with the fermion from one pair annihilating the antifermion from the other pair and the
remaining fermion-antifermion pair being subsequently annihilated by the axial current.
This is a long-winded way of describing the amplitude $\langle 0|\,T\,J_5^\lambda(0)J^\mu(x_1)J^\nu(x_2)\,|0\rangle$ in
words, but I want to make sure that you know what I am talking about. Feynman tells
us that the Fourier transform of this amplitude is given by the two “triangle” diagrams in
figure IV.7.1a and b.




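$$\Delta^{\lambda\mu\nu}(k_1, k_2) = (-1)\,i^3\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\nu\frac{1}{\not{p} - \not{k}_1}\gamma^\mu\frac{1}{\not{p}} + \gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\mu\frac{1}{\not{p} - \not{k}_2}\gamma^\nu\frac{1}{\not{p}}\right) \qquad (1)$$

with $q = k_1 + k_2$. Note that the two terms are required by Bose statistics. The overall factor
of $(-1)$ comes from the closed fermion loop.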

Classically, we have two symmetries implying $\partial_\mu J^\mu = 0$ and $\partial_\mu J_5^\mu = 0$. In the quantum
theory, if $\partial_\mu J^\mu = 0$ continues to hold, then we should have $k_{1\mu}\Delta^{\lambda\mu\nu} = 0$ and $k_{2\nu}\Delta^{\lambda\mu\nu} = 0$,
and if $\partial_\mu J_5^\mu = 0$ continues to hold, then $q_\lambda\Delta^{\lambda\mu\nu} = 0$. Now that we have things all set up,
we merely have to calculate $\Delta^{\lambda\mu\nu}$ to see if the two symmetries hold up under quantum
fluctuations. No big deal.




Figure IV.7.1 
Before we blindly calculate, however, let us ask ourselves how sad we would be if either of
the two currents $J^\mu$ and $J_5^\mu$ fails to be conserved. Well, we would be very upset if the vector
current is not conserved. The corresponding charge $Q = \int d^3x\, J^0$ counts the number of
fermions. We wouldn’t want our fermions to disappear into thin air or pop out of nowhere.
Furthermore, it may please us to couple the photon to the fermion field $\psi$. In that case,
you would recall from chapter II.7 that we need $\partial_\mu J^\mu = 0$ to prove gauge invariance and
hence show that the photon has only two degrees of polarization. More explicitly, imagine
a photon line coming into the vertex labeled by $\mu$ in figure IV.7.1a and b with propagator
$(i/k_1^2)[\xi(k_{1\mu}k_{1\rho}/k_1^2) - g_{\mu\rho}]$. The gauge dependent term $\xi(k_{1\mu}k_{1\rho}/k_1^2)$ would not go away if
the vector current is not conserved, that is, if $k_{1\mu}\Delta^{\lambda\mu\nu}$ fails to vanish.

On the other hand, quite frankly, just between us friends, we won’t get too upset if
quantum fluctuation violates axial current conservation. Who cares if the axial charge
$Q^5 = \int d^3x\, J_5^0$ is not constant in time?


Shifting integration variable 


So, do $k_{1\mu}\Delta^{\lambda\mu\nu}$ and $k_{2\nu}\Delta^{\lambda\mu\nu}$ vanish? We will look over Professor Confusio’s shoulders as
he calculates $k_{1\mu}\Delta^{\lambda\mu\nu}$. (We are now in the 1960s, long after the development of renormalization
theory as described in chapter III.1 and Confusio has managed to get a tenure
track assistant professorship.) He hits $\Delta^{\lambda\mu\nu}$ as written in (1) with $k_{1\mu}$ and using what he
learned in chapter II.7 writes $\not{k}_1$ in the first term as $\not{p} - (\not{p} - \not{k}_1)$ and in the second term
as $(\not{p} - \not{k}_2) - (\not{p} - \not{q})$, thus obtaining

$$k_{1\mu}\Delta^{\lambda\mu\nu}(k_1, k_2) = i\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\nu\frac{1}{\not{p} - \not{k}_1} - \gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{k}_2}\gamma^\nu\frac{1}{\not{p}}\right) \qquad (2)$$

Just as in chapter II.7, Confusio recognizes that in the integrand the first term is just the
second term with the shift of the integration variable $p \to p - k_1$. The two terms cancel
and Professor Confusio publishes a paper saying $k_{1\mu}\Delta^{\lambda\mu\nu} = 0$, as we all expect.

Remember back in chapter II.7 I said we were going to worry later about whether it is 
legitimate to shift integration variables. Now is the time to worry! 

You could have asked your calculus teacher long ago when it is legitimate to shift
integration variables. When is $\int_{-\infty}^{+\infty} dp\, f(p + a)$ equal to $\int_{-\infty}^{+\infty} dp\, f(p)$? The difference
between these two integrals is

$$\int_{-\infty}^{+\infty} dp\left(a\frac{d}{dp}f(p) + \cdots\right) = a\,(f(+\infty) - f(-\infty)) + \cdots$$

Clearly, if $f(+\infty)$ and $f(-\infty)$ are two different constants, then it is not okay to shift. But
if the integral $\int_{-\infty}^{+\infty} dp\, f(p)$ is convergent, or even logarithmically divergent, it is certainly
okay. It was okay in chapter II.7 but definitely not here in (2)!
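The one-dimensional statement can be checked numerically. The following is a minimal sketch, not part of the original text: the two test functions, the cutoff, and the helper name `shift_difference` are illustrative assumptions. It compares an integrand that tends to two different constants at $\pm\infty$ (the analog of a linearly divergent integral) with one that falls off like $1/p^2$.

```python
import math

def shift_difference(f, a, cutoff, steps=100_000):
    """Midpoint-rule estimate of the integral of f(p + a) - f(p) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for n in range(steps):
        p = -cutoff + (n + 0.5) * h
        total += f(p + a) - f(p)
    return total * h

a = 0.7

# tanh tends to -1 and +1 at the two ends, so f(+inf) - f(-inf) = 2.
step_like = math.tanh

# 1/(1 + p^2) gives a convergent integral, so shifting should be harmless.
def convergent(p):
    return 1.0 / (1.0 + p * p)

diff_step = shift_difference(step_like, a, cutoff=50.0)
diff_conv = shift_difference(convergent, a, cutoff=50.0)

print("step-like integrand:", diff_step, "expected a*(f(+inf)-f(-inf)) =", 2 * a)
print("convergent integrand:", diff_conv)
```

For the step-like integrand the shift changes the integral by $a\,(f(+\infty) - f(-\infty)) = 2a$, while for the convergent one the difference is only a cutoff effect that shrinks as the cutoff grows.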
As usual, we rotate the Feynman integrand to Euclidean space. Generalizing our observation
above to d-dimensional Euclidean space, we have

$$\int d^d_E p\,[f(p + a) - f(p)] = \int d^d_E p\,[a^\mu\partial_\mu f(p) + \cdots]$$

which by Gauss’s theorem is given by a surface integral over an infinitely large sphere
enclosing all of Euclidean spacetime and hence equal to

$$\lim_{P\to\infty} a^\mu\left(\frac{P_\mu}{P}\right) f(P)\, S_{d-1}(P)$$

where $S_{d-1}(P)$ is the area of a $(d-1)$-dimensional sphere (see appendix D) and where an
average over the surface of the sphere is understood. (Recall from our experience evaluating
Feynman diagrams that the average of $P^\mu P^\nu/P^2$ is equal to $\frac{1}{4}\eta^{\mu\nu}$ by a symmetry argument,
with the normalization $\frac{1}{4}$ fixed by contracting with $\eta_{\mu\nu}$.) Rotating back, we have for a 4-
dimensional Minkowskian integral

$$\int d^4p\,[f(p + a) - f(p)] = \lim_{P\to\infty} i a^\mu\left(\frac{P_\mu}{P}\right) f(P)\,(2\pi^2P^3) \qquad (3)$$

Note the i from Wick rotating back.
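The two geometric inputs to (3) can be checked independently. The sketch below is an illustration under stated assumptions: it uses the standard closed form $S_{d-1}(P) = 2\pi^{d/2}P^{d-1}/\Gamma(d/2)$ (the text defers to appendix D for this), and it estimates the angular average in Euclidean space by sampling directions from normalized Gaussian vectors. The function names are hypothetical.

```python
import math
import random

def sphere_area(d, radius):
    """Area of the (d-1)-dimensional sphere of the given radius in d Euclidean dimensions."""
    return 2.0 * math.pi ** (d / 2) * radius ** (d - 1) / math.gamma(d / 2)

def average_nn(d, samples=50_000, seed=0):
    """Monte Carlo estimate of the average of n_mu n_nu over the unit sphere in d dimensions."""
    rng = random.Random(seed)
    acc = [[0.0] * d for _ in range(d)]
    for _ in range(samples):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm2 = sum(c * c for c in x)
        for mu in range(d):
            for nu in range(d):
                acc[mu][nu] += x[mu] * x[nu] / norm2
    return [[v / samples for v in row] for row in acc]

area = sphere_area(4, 1.0)   # should equal 2 * pi^2 for unit radius
avg = average_nn(4)          # should be close to (1/4) times the identity

print("S_3(1) =", area, " 2*pi^2 =", 2 * math.pi ** 2)
for row in avg:
    print(["%.3f" % v for v in row])
```

The diagonal entries come out near 1/4 and the off-diagonal entries near zero, which is the Euclidean version of the statement that the average of $P^\mu P^\nu/P^2$ is $\frac{1}{4}\eta^{\mu\nu}$.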
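Applying (3) with

$$f(p) = \mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{k}_2}\gamma^\nu\frac{1}{\not{p}}\right) = \frac{\mathrm{tr}[\gamma^5(\not{p} - \not{k}_2)\gamma^\nu\not{p}\gamma^\lambda]}{(p - k_2)^2p^2} = \frac{4i\varepsilon^{\tau\nu\sigma\lambda}k_{2\tau}p_\sigma}{(p - k_2)^2p^2}$$

we obtain

$$k_{1\mu}\Delta^{\lambda\mu\nu} = \frac{i}{(2\pi)^4}\lim_{P\to\infty} i(-k_1)^\mu\frac{P_\mu}{P}\,\frac{4i\varepsilon^{\tau\nu\sigma\lambda}k_{2\tau}P_\sigma}{P^4}\,2\pi^2P^3 = \frac{i}{8\pi^2}\varepsilon^{\lambda\nu\tau\sigma}k_{1\tau}k_{2\sigma}$$

Contrary to what Confusio said, $k_{1\mu}\Delta^{\lambda\mu\nu} \neq 0$.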

As I have already said, this would be a disaster. Fermion number is not conserved and 
matter would be disintegrating all around us! What is the way out? 

In fact, we are only marginally smarter than Professor Confusio. We did not notice that
the integral defining $\Delta^{\lambda\mu\nu}$ in (1) is linearly divergent and is thus not well defined.

Oops, even before we worry about calculating $k_{1\mu}\Delta^{\lambda\mu\nu}$ and $k_{2\nu}\Delta^{\lambda\mu\nu}$ we had better worry
about whether or not $\Delta^{\lambda\mu\nu}$ depends on the physicist doing the calculation. In other
words, suppose another physicist chooses 1 to shift the integration variable p in the linearly
divergent integral in (1) by an arbitrary 4-vector a and define

$$\Delta^{\lambda\mu\nu}(a, k_1, k_2) = (-1)\,i^3\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} + \not{a} - \not{q}}\gamma^\nu\frac{1}{\not{p} + \not{a} - \not{k}_1}\gamma^\mu\frac{1}{\not{p} + \not{a}}\right) + \{\mu, k_1 \leftrightarrow \nu, k_2\} \qquad (4)$$

There can be as many results for the Feynman diagrams in figure IV.7.1a and b as there
are physicists! That would be the end of physics, or at least quantum field theory, for sure.


1 This is the freedom of choice in labeling internal momenta mentioned in chapter I.7.
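Well, whose result should we declare to be correct?

The only sensible answer is that we trust the person who chooses an a such that
$k_{1\mu}\Delta^{\lambda\mu\nu}(a, k_1, k_2)$ and $k_{2\nu}\Delta^{\lambda\mu\nu}(a, k_1, k_2)$ vanish, so that the photon will have the right
number of degrees of freedom should we introduce a photon into the theory.

Let us compute $\Delta^{\lambda\mu\nu}(a, k_1, k_2) - \Delta^{\lambda\mu\nu}(k_1, k_2)$ by applying (3) to
$f(p) = \mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\nu\frac{1}{\not{p} - \not{k}_1}\gamma^\mu\frac{1}{\not{p}}\right)$. Noting that

$$\lim_{P\to\infty} f(P) = \lim_{P\to\infty}\frac{\mathrm{tr}(\gamma^\lambda\gamma^5\not{P}\gamma^\nu\not{P}\gamma^\mu\not{P})}{P^6} = \frac{2P^\mu\,\mathrm{tr}(\gamma^\lambda\gamma^5\not{P}\gamma^\nu\not{P}) - P^2\,\mathrm{tr}(\gamma^\lambda\gamma^5\not{P}\gamma^\nu\gamma^\mu)}{P^6} = \frac{+4iP^2P_\sigma\varepsilon^{\sigma\nu\mu\lambda}}{P^6}$$

we see that

$$\Delta^{\lambda\mu\nu}(a, k_1, k_2) - \Delta^{\lambda\mu\nu}(k_1, k_2) = \frac{4i}{8\pi^2}\lim_{P\to\infty} a^\omega\frac{P_\omega P_\sigma}{P^2}\varepsilon^{\sigma\nu\mu\lambda} + \{\mu, k_1 \leftrightarrow \nu, k_2\} = \frac{i}{8\pi^2}\varepsilon^{\sigma\nu\mu\lambda}a_\sigma + \{\mu, k_1 \leftrightarrow \nu, k_2\} \qquad (5)$$

There are two independent momenta $k_1$ and $k_2$ in the problem, so we can take
$a = \alpha(k_1 + k_2) + \beta(k_1 - k_2)$. Plugging into (5), we obtain

$$\Delta^{\lambda\mu\nu}(a, k_1, k_2) = \Delta^{\lambda\mu\nu}(k_1, k_2) + \frac{i\beta}{4\pi^2}\varepsilon^{\lambda\mu\nu\sigma}(k_1 - k_2)_\sigma \qquad (6)$$

Note that $\alpha$ drops out.

As expected, $\Delta^{\lambda\mu\nu}(a, k_1, k_2)$ depends on $\beta$, and hence on a. Our unshakable desire to
have a conserved vector current, that is, $k_{1\mu}\Delta^{\lambda\mu\nu}(a, k_1, k_2) = 0$, now fixes the parameter $\beta$
upon recalling $k_{1\mu}\Delta^{\lambda\mu\nu}(k_1, k_2) = \frac{i}{8\pi^2}\varepsilon^{\lambda\nu\tau\sigma}k_{1\tau}k_{2\sigma}$.

Hence, we must choose to deal with $\Delta^{\lambda\mu\nu}(a, k_1, k_2)$ with $\beta = -\frac{1}{2}$.

One way of viewing all this is to say that the Feynman rules do not suffice in determining
$\langle 0|\,T\,J_5^\lambda(0)J^\mu(x_1)J^\nu(x_2)\,|0\rangle$. They have to be supplemented by vector current conservation.
The amplitude $\langle 0|\,T\,J_5^\lambda(0)J^\mu(x_1)J^\nu(x_2)\,|0\rangle$ is defined by $\Delta^{\lambda\mu\nu}(a, k_1, k_2)$ with $\beta = -\frac{1}{2}$.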


Quantum fluctuation violates axial current conservation 

Now we come to the punchline of the story. We insisted that the vector current be conserved. 
Is the axial current also conserved? 

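To answer this question, we merely have to compute

$$q_\lambda\Delta^{\lambda\mu\nu}(a, k_1, k_2) = q_\lambda\Delta^{\lambda\mu\nu}(k_1, k_2) + \frac{i}{4\pi^2}\varepsilon^{\mu\nu\lambda\sigma}k_{1\lambda}k_{2\sigma} \qquad (7)$$

By now, you know how to do this:

$$q_\lambda\Delta^{\lambda\mu\nu}(k_1, k_2) = i\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\nu\frac{1}{\not{p} - \not{k}_1}\gamma^\mu - \gamma^5\frac{1}{\not{p} - \not{k}_2}\gamma^\nu\frac{1}{\not{p}}\gamma^\mu\right) + \{\mu, k_1 \leftrightarrow \nu, k_2\} = \frac{i}{4\pi^2}\varepsilon^{\mu\nu\lambda\sigma}k_{1\lambda}k_{2\sigma} \qquad (8)$$

Indeed, you recognize that the integration has already been done in (2). We finally obtain

$$q_\lambda\Delta^{\lambda\mu\nu}(a, k_1, k_2) = \frac{i}{2\pi^2}\varepsilon^{\mu\nu\lambda\sigma}k_{1\lambda}k_{2\sigma} \qquad (9)$$

The axial current is not conserved!

In summary, in the simple theory $\mathcal{L} = \bar\psi i\gamma^\mu\partial_\mu\psi$, while the vector and axial currents are
both conserved classically, quantum fluctuation destroys axial current conservation. This
phenomenon is known variously as the anomaly, the axial anomaly, or the chiral anomaly.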
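The index bookkeeping that leads from (6) to $\beta = -\frac{1}{2}$ and then to (9) is easy to get wrong by a sign, so here is a minimal sketch that checks it with exact rational arithmetic. It takes the loop integrals as given inputs from the text: the coefficient $i/8\pi^2$ of the unshifted vector divergence, $i\beta/4\pi^2$ from (6), and $i/4\pi^2$ from (8). It verifies only the contractions with the antisymmetric symbol. All quantities are in units of $i/\pi^2$; the integer momentum components and the function names are arbitrary choices for illustration.

```python
from fractions import Fraction

def levi_civita(*idx):
    """Totally antisymmetric symbol, normalized so levi_civita(0, 1, 2, 3) == 1."""
    if len(set(idx)) < len(idx):
        return 0
    inversions = sum(1 for i in range(len(idx))
                     for j in range(i + 1, len(idx)) if idx[i] > idx[j])
    return -1 if inversions % 2 else 1

R = range(4)
k1 = [3, -1, 4, 1]    # hypothetical test momenta (components with lower indices)
k2 = [-2, 7, 1, 8]
q = [x + y for x, y in zip(k1, k2)]   # q = k1 + k2
d = [x - y for x, y in zip(k1, k2)]   # k1 - k2, the combination multiplying beta in (6)

def eps_kk(a, b):
    """eps^{a b tau sigma} k1_tau k2_sigma."""
    return sum(levi_civita(a, b, t, s) * k1[t] * k2[s] for t in R for s in R)

def vector_ward(lam, nu, beta):
    """k1_mu Delta^{lam mu nu}(a, k1, k2) in units of i/pi^2."""
    unshifted = Fraction(1, 8) * eps_kk(lam, nu)
    shift = Fraction(beta) / 4 * sum(
        levi_civita(lam, mu, nu, s) * k1[mu] * d[s] for mu in R for s in R)
    return unshifted + shift

def axial_ward(mu, nu, beta):
    """q_lam Delta^{lam mu nu}(a, k1, k2) in units of i/pi^2."""
    unshifted = Fraction(1, 4) * eps_kk(mu, nu)
    shift = Fraction(beta) / 4 * sum(
        levi_civita(lam, mu, nu, s) * q[lam] * d[s] for lam in R for s in R)
    return unshifted + shift

beta = Fraction(-1, 2)
vector_conserved = all(vector_ward(l, n, beta) == 0 for l in R for n in R)
axial_matches_9 = all(axial_ward(m, n, beta) == Fraction(1, 2) * eps_kk(m, n)
                      for m in R for n in R)
print(vector_conserved, axial_matches_9)
```

With $\beta = 0$ the vector divergence does not vanish, which is Confusio's unshifted result. With $\beta = -\frac{1}{2}$ it vanishes identically and the axial divergence comes out with coefficient $\frac{1}{2}$ in these units, that is, $i/2\pi^2$ as in (9).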


Consequences of the anomaly 


As I said, the anomaly is an extraordinarily rich subject. I will content myself with a series 
of remarks, the details of which you should work out as exercises. 


1. Suppose we gauge our simple theory $\mathcal{L} = \bar\psi i\gamma^\mu(\partial_\mu - ieA_\mu)\psi$ and speak of $A_\mu$ as the photon
field. Then in figure IV.7.1 we can think of two photon lines coming out of the vertices labeled
$\mu$ and $\nu$. Our central result (9) can then be written elegantly as two operator equations:

Classical physics: $\partial_\mu J_5^\mu = 0$ (10)

Quantum physics: $\partial_\mu J_5^\mu = \frac{e^2}{(4\pi)^2}\varepsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma}$ (11)

The divergence of the axial current $\partial_\mu J_5^\mu$ is not zero, but is an operator capable of producing
two photons.

2. Applying the same type of argument as in chapter IV.2 we can calculate the rate of the
decay $\pi^0 \to \gamma + \gamma$. Indeed, historically people used the erroneous result (10) to deduce that
this experimentally observed decay cannot occur! See exercise IV.7.2. The resolution of this
apparent paradox led to the correct result (11).

3. Writing the Lagrangian in terms of left and right handed fields $\psi_R$ and $\psi_L$ and introducing
the left and right handed currents $J_R^\mu \equiv \bar\psi_R\gamma^\mu\psi_R$ and $J_L^\mu \equiv \bar\psi_L\gamma^\mu\psi_L$, we can repackage the
anomaly as

$$\partial_\mu J_R^\mu = \frac{1}{2}\frac{e^2}{(4\pi)^2}\varepsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma} \quad\text{and}\quad \partial_\mu J_L^\mu = -\frac{1}{2}\frac{e^2}{(4\pi)^2}\varepsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma} \qquad (12)$$

(Hence the name chiral!) We can think of left handed and right handed fermions running
around the loop in figure IV.7.1, contributing oppositely to the anomaly.

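4. Consider the theory $\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi$. Then invariance under the transformation
$\psi \to e^{i\theta\gamma^5}\psi$ is spoiled by the mass term. Classically, $\partial_\mu J_5^\mu = 2m\bar\psi i\gamma^5\psi$. The axial current is
explicitly not conserved. The anomaly now says that quantum fluctuation produces an
additional term. In the theory $\mathcal{L} = \bar\psi[i\gamma^\mu(\partial_\mu - ieA_\mu) - m]\psi$ we have

$$\partial_\mu J_5^\mu = 2m\bar\psi i\gamma^5\psi + \frac{e^2}{(4\pi)^2}\varepsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma} \qquad (13)$$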

5. Recall that in chapter III.7 we introduced Pauli-Villars regulators to calculate vacuum
polarization. We subtract from the integrand what the integrand would have been if the
electron mass were replaced by some regulator mass. The analog of electron mass in (1) is
in fact 0 and so we subtract from the integrand what the integrand would have been if 0
were replaced by a regulator mass M. In other words, we now define

$$\Delta^{\lambda\mu\nu}(k_1, k_2) = (-1)\,i^3\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q}}\gamma^\nu\frac{1}{\not{p} - \not{k}_1}\gamma^\mu\frac{1}{\not{p}} - \gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q} - M}\gamma^\nu\frac{1}{\not{p} - \not{k}_1 - M}\gamma^\mu\frac{1}{\not{p} - M}\right) + \{\mu, k_1 \leftrightarrow \nu, k_2\} \qquad (14)$$

Note that as $p \to \infty$ the integrand now vanishes faster than $1/p^3$. This is in accordance
with the philosophy of regularization outlined in chapters III.1 and III.7: For $p \ll M$, the
threshold of ignorance, the integrand is unchanged. But for $p \gg M$, the integrand is cut
off. Now the integral in (14) is superficially logarithmically divergent and we can shift the
integration variable p at will.

So how does the chiral anomaly arise? By including the regulator mass M we have broken 
axial current conservation explicitly. The anomaly is the statement that this breaking persists 
even when we let M tend to infinity. It is extremely instructive (see exercise IV.7.4) to work 
this out. 

6. Consider the nonabelian theory $\mathcal{L} = \bar\psi i\gamma^\mu(\partial_\mu - igA^a_\mu T^a)\psi$. We merely have to include in
the Feynman amplitude a factor of $T^a$ at the vertex labeled by $\mu$ and a factor of $T^b$ at the vertex
labeled by $\nu$. Everything goes through as before except that in summing over all the different
fermions that run around the loop we obtain a factor $\mathrm{tr}\, T^aT^b$. Thus, we see instantly that
in a nonabelian gauge theory

$$\partial_\mu J_5^\mu = \frac{g^2}{(4\pi)^2}\varepsilon^{\mu\nu\lambda\sigma}\,\mathrm{tr}\, F_{\mu\nu}F_{\lambda\sigma} \qquad (15)$$

where $F_{\mu\nu} = F^a_{\mu\nu}T^a$ is the matrix field strength defined in chapter IV.5. Nonabelian symmetry
tells us something remarkable: The object $\varepsilon^{\mu\nu\lambda\sigma}\,\mathrm{tr}\, F_{\mu\nu}F_{\lambda\sigma}$ contains not only a term
quadratic in A, but also terms cubic and quartic in A, and hence there is also a chiral
anomaly with three and four gauge bosons coming in, as indicated in figure IV.7.2a and
b. Some people refer to the anomaly produced in figures IV.7.1 and IV.7.2 as the triangle,
square, and pentagon anomaly. Historically, after the triangle anomaly was discovered, there
was a controversy as to whether the square and pentagon anomaly existed. The nonabelian
symmetry argument given here makes things totally obvious, but at the time people calculated
Feynman diagrams explicitly and, as we just saw, there are subtleties lying in wait for
the unwary.

Figure IV.7.2

7. We will see in chapter V.7 that the anomaly has deep connections to topology. 

8. We computed the chiral anomaly in the free theory $\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi$. Suppose we
couple the fermion to a scalar field by adding $f\varphi\bar\psi\psi$ or to the electromagnetic field for
that matter. Now we have to calculate higher order diagrams such as the three-loop diagram
in figure IV.7.3. You would expect that the right-hand side of (9) would be multiplied by
$1 + h(f, e, \cdots)$, where h is some unknown function of all the couplings in the theory.

Surprise! Adler and Bardeen proved that h = 0. This apparently miraculous fact, known as
the nonrenormalization of the anomaly, can be understood heuristically as follows. Before
we integrate over the momenta of the scalar propagators in figure IV.7.3 (labeled by $w_1$
and $w_2$) the Feynman integrand has seven fermion propagators and thus is more than
sufficiently convergent that we can shift integration variables with impunity. Thus, before
we integrate over $w_1$ and $w_2$ all the appropriate Ward identities are satisfied, for instance,
$q_\lambda\Delta^{\lambda\mu\nu}_{\rm 3\ loops}(k_1, k_2; w_1, w_2) = 0$. You can easily complete the proof. You will give a proof 2 based
on topology in exercise V.7.13.


2 For a simple proof not involving topology, see J. Collins, Renormalization, p. 352. 
Figure IV.7.4


9. The preceding point was of great importance in the history of particle physics as it led directly
to the notion of color, as we will discuss in chapter VII.3. The nonrenormalization of the
anomaly allowed the decay amplitude for $\pi^0 \to \gamma + \gamma$ to be calculated with confidence in the
late 1960s. In the quark model of the time, the amplitude is given by an infinite number of
Feynman diagrams, as indicated in figure IV.7.4 (with a quark running around the fermion
loop), but the nonrenormalization of the anomaly tells us that only figure IV.7.4a contributes.

In other words, the amplitude does not depend on the details of the strong interaction. That
it came out a factor of 3 too small suggested that quarks come in 3 copies, as we will see in
chapter VII.3.

10. It is natural to speculate as to whether quarks and leptons are composites of yet more 
fundamental fermions known as preons. The nonrenormalization of the chiral anomaly 
provides a powerful tool for this sort of theoretical speculation. No matter how complicated 
the relevant interactions might be, as long as they are described by field theory as we know 
it, the anomaly at the preon level must be the same as the anomaly at the quark-lepton level. 
This so-called anomaly matching condition 3 severely constrains the possible preon theories. 

11. Historically, field theorists were deeply suspicious of the path integral, preferring the 
canonical approach. When the chiral anomaly was discovered, some people even argued that 
the existence of the anomaly proved that the path integral was wrong. Look, these people 
said, the path integral 

$$\int D\bar\psi D\psi\, e^{i\int d^4x\,\bar\psi i\gamma^\mu\partial_\mu\psi} \qquad (16)$$

is too stupid to tell us that it is not invariant under the chiral transformation $\psi \to e^{i\theta\gamma^5}\psi$.
Fujikawa resolved the controversy by showing that the path integral did know about the
anomaly: Under the chiral transformation the measure $D\bar\psi D\psi$ changes by a Jacobian.
Recall that this was how I motivated this chapter: The action may be invariant but not the 
path integral. 


3 G. ’t Hooft, in: G. 't Hooft et al., eds., Recent Developments in Gauge Theories; A. Zee, Phys. Lett. 95B:290,1980. 
Exercises


IV.7.1 Derive (11) from (9). The momentum factors $k_{1\lambda}$ and $k_{2\sigma}$ in (9) become the two derivatives in $F_{\mu\nu}F_{\lambda\sigma}$ in
(11).

IV.7.2 Following the reasoning in chapter IV.2 and using the erroneous (10) show that the decay amplitude for
the decay $\pi^0 \to \gamma + \gamma$ would vanish in the ideal world in which the $\pi^0$ is massless. Since the $\pi^0$ does
decay and since our world is close to the ideal world, this provided the first indication historically that
(10) cannot possibly be valid.

IV.7.3 Repeat all the calculations in the text for the theory $\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi$.


IV.7.4 Take the Pauli-Villars regulated $\Delta^{\lambda\mu\nu}(k_1, k_2)$ and contract it with $q_\lambda$. The analog of the trick in chapter II.7
is to write $\not{q}\gamma^5$ in the second term as $[2M + (\not{p} - M) - (\not{p} - \not{q} + M)]\gamma^5$. Now you can freely shift
integration variables. Show that

$$q_\lambda\Delta^{\lambda\mu\nu}(k_1, k_2) = -2M\Delta^{\mu\nu}(k_1, k_2) \qquad (17)$$

where

$$\Delta^{\mu\nu}(k_1, k_2) \equiv (-1)\,i^3\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^5\frac{1}{\not{p} - \not{q} - M}\gamma^\nu\frac{1}{\not{p} - \not{k}_1 - M}\gamma^\mu\frac{1}{\not{p} - M}\right) + \{\mu, k_1 \leftrightarrow \nu, k_2\}$$

Evaluate $\Delta^{\mu\nu}$ and show that $\Delta^{\mu\nu}$ goes as 1/M in the limit $M \to \infty$ and so the right hand side of (17)
goes to a finite limit. The anomaly is what the regulator leaves behind as it disappears from the low
energy spectrum: It is like the smile of the Cheshire cat. [We can actually argue that $\Delta^{\mu\nu}$ goes as 1/M
without doing a detailed calculation. By Lorentz invariance and because of the presence of $\gamma^5$, $\Delta^{\mu\nu}$
must be proportional to $\varepsilon^{\mu\nu\lambda\rho}k_{1\lambda}k_{2\rho}$, but by dimensional analysis, $\Delta^{\mu\nu}$ must be some constant times
$\varepsilon^{\mu\nu\lambda\rho}k_{1\lambda}k_{2\rho}/M$. You might ask why we can’t use something like $1/(k_1^2)^{1/2}$ instead of 1/M to make the
dimension come out right. The answer is that from your experience in evaluating Feynman diagrams in
(3 + 1)-dimensional spacetime you can never get a factor like $1/(k_1^2)^{1/2}$.]


IV.7.5 There are literally N ways of deriving the anomaly. Here is another. Evaluate

$$\Delta^{\lambda\mu\nu}(k_1, k_2) = (-1)\,i^3\int\frac{d^4p}{(2\pi)^4}\,\mathrm{tr}\left(\gamma^\lambda\gamma^5\frac{1}{\not{p} - \not{q} - m}\gamma^\nu\frac{1}{\not{p} - \not{k}_1 - m}\gamma^\mu\frac{1}{\not{p} - m}\right) + \{\mu, k_1 \leftrightarrow \nu, k_2\}$$

in the massive fermion case not by brute force but by first using Lorentz invariance to write

$$\Delta^{\lambda\mu\nu}(k_1, k_2) = \varepsilon^{\lambda\mu\nu\sigma}k_{1\sigma}A_1 + \cdots + \varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}k_2^\lambda A_8$$

where $A_i \equiv A_i(k_1^2, k_2^2, q^2)$ are eight functions of the three Lorentz scalars in the problem. You are
supposed to fill in the dots. By counting powers as in chapters III.3 and III.7 show that two of these
functions are given by superficially logarithmically divergent integrals while the other six are given by
perfectly convergent integrals. Next, impose Bose statistics and vector current conservation $k_{1\mu}\Delta^{\lambda\mu\nu} =
0 = k_{2\nu}\Delta^{\lambda\mu\nu}$ to show that we can avoid calculating the superficially logarithmically divergent integrals.
Compute the convergent integrals and then evaluate $q_\lambda\Delta^{\lambda\mu\nu}(k_1, k_2)$.


IV.7.6 Discuss the anomaly by studying the amplitude

$$\langle 0|\,T\,J_5^\lambda(0)J_5^\mu(x_1)J_5^\nu(x_2)\,|0\rangle$$

given in lowest orders by triangle diagrams with axial currents at each vertex. [Hint: Call the momentum
space amplitude $\Delta_5^{\lambda\mu\nu}(k_1, k_2)$.] Show by using $(\gamma^5)^2 = 1$ and Bose symmetry that

$$\Delta_5^{\lambda\mu\nu}(k_1, k_2) = \frac{1}{3}[\Delta^{\lambda\mu\nu}(a, k_1, k_2) + \Delta^{\mu\nu\lambda}(a, k_2, -q) + \Delta^{\nu\lambda\mu}(a, -q, k_1)]$$

Now use (9) to evaluate $q_\lambda\Delta_5^{\lambda\mu\nu}(k_1, k_2)$.

IV.7.7 Define the fermionic measure $D\psi$ in (16) carefully by going to Euclidean space. Calculate the Jacobian
upon a chiral transformation and derive the anomaly. [Hint: For help, see K. Fujikawa, Phys. Rev. Lett.
42: 1195, 1979.]

IV.7.8 Compute the pentagon anomaly by Feynman diagrams in order to check remark 6 in the text. In other
words, determine the coefficient c in $\partial_\mu J_5^\mu = \cdots + c\,\varepsilon^{\mu\nu\lambda\sigma}\,\mathrm{tr}\, A_\mu A_\nu A_\lambda A_\sigma$.



Part V 


Field Theory and Collective Phenomena 


I mentioned in the introduction that one of the more intellectually satisfying developments 
in the last two or three decades has been the increasingly important role played by field 
theoretic methods in condensed matter physics. This is a rich and diverse subject; in this 
and subsequent chapters I can barely describe the tip of the iceberg and will have to content 
myself with a few selected topics. 

Historically, field theory was introduced into condensed matter physics in a rather direct 
and straightforward fashion. The nonrelativistic electrons in a condensed matter system 
can be described by a field $\psi$, along the lines discussed in chapter III.5. Field theoretic
Lagrangians may then be written down, Feynman diagrams and rules developed, and so 
on and so forth. This is done in a number of specialized texts. What we present here is to a 
large extent the more modern view of an effective field theoretic description of a condensed 
matter system, valid at low energy and momentum. One of the fascinations of condensed 
matter physics is that due to highly nontrivial many body effects the low energy degrees 
of freedom might be totally different from the electrons we started out with. A particularly 
striking example (to be discussed in chapter VI.2) is the quantum Hall system, in which 
the low energy effective degree of freedom carries fractional charge and statistics. 

Another advantage of devoting a considerable portion of a field theory book to condensed 
matter physics is that historically and pedagogically it is much easier to understand the 
renormalization group in condensed matter physics than in particle physics. 

I will defiantly not stick to a legalistic separation between condensed matter and particle 
physics. Some of the topics treated in Parts V and VI actually belong to particle physics. 
And of course I cannot be responsible for explaining condensed matter physics, any more 
than I could be responsible for explaining particle physics in chapter IV.2. 
Superfluids


Repulsive bosons


Consider a finite density $\bar\rho$ of nonrelativistic bosons interacting with a short ranged repulsion.
Return to (III.5.11):

$$\mathcal{L} = i\varphi^\dagger\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi - g^2(\varphi^\dagger\varphi - \bar\rho)^2 \qquad (1)$$

The last term is exactly the Mexican well potential of chapter IV.1, forcing the magnitude
of $\varphi$ to be close to $\sqrt{\bar\rho}$, thus suggesting that we use polar variables $\varphi \equiv \sqrt{\rho}\,e^{i\theta}$ as we did
in (III.5.7). Plugging in and dropping the total derivative $(i/2)\partial_0\rho$, we obtain

$$\mathcal{L} = -\rho\,\partial_0\theta - \frac{1}{2m}\left[\frac{1}{4\rho}(\partial_i\rho)^2 + \rho(\partial_i\theta)^2\right] - g^2(\rho - \bar\rho)^2 \qquad (2)$$

Spontaneous symmetry breaking 


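As in chapter IV.1 write $\sqrt{\rho} = \sqrt{\bar\rho} + h$ (the vacuum expectation value of $\varphi$ is $\sqrt{\bar\rho}$), assume
$h \ll \sqrt{\bar\rho}$, and expand 1:

$$\mathcal{L} = -2\sqrt{\bar\rho}\,h\,\partial_0\theta - \frac{\bar\rho}{2m}(\partial_i\theta)^2 - \frac{1}{2m}(\partial_ih)^2 - 4g^2\bar\rho h^2 + \cdots \qquad (3)$$

Picking out the terms up to quadratic in h in (3) we use the “central identity of quantum
field theory” (see appendix A) to integrate out h, obtaining

$$\mathcal{L} = \bar\rho\,\partial_0\theta\,\frac{1}{4g^2\bar\rho - (1/2m)\partial_i^2}\,\partial_0\theta - \frac{\bar\rho}{2m}(\partial_i\theta)^2 + \cdots = \frac{1}{4g^2}(\partial_0\theta)^2 - \frac{\bar\rho}{2m}(\partial_i\theta)^2 + \cdots \qquad (4)$$


1 Note that we have dropped the (potentially interesting) term $-\bar\rho\,\partial_0\theta$ because it is a total divergence.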
In the second equality we assumed that we are looking at processes with wave number k
small compared to $\sqrt{8g^2\bar\rho m}$ so that $(1/2m)\partial_i^2$ is negligible compared to $4g^2\bar\rho$. Thus, we see
that there exists in this fluid of bosons a gapless mode (often referred to as the phonon)
with the dispersion

$$\omega^2 = \frac{2g^2\bar\rho}{m}k^2 \qquad (5)$$

The learned among you will have realized that we have obtained Bogoliubov’s classic result
without ever doing a Bogoliubov rotation. 2

Let me briefly remind you of Landau’s idealized argument 3 that a linearly dispersing
mode (that is, $\omega$ is linear in k) implies superfluidity. Consider a mass M of fluid flowing
down a tube with velocity v. It could lose momentum and slow down to velocity $v'$
by creating an excitation of momentum k: $Mv = Mv' + \hbar k$. This is only possible with
sufficient energy to spare if $\frac{1}{2}Mv^2 > \frac{1}{2}Mv'^2 + \hbar\omega(k)$. Eliminating $v'$ we obtain for M
macroscopic $v > \omega/k$. For a linearly dispersing mode this gives a critical velocity $v_c \equiv \omega/k$
below which the fluid cannot lose momentum and is hence super. [Thus, from (5) the
idealized $v_c = g\sqrt{2\bar\rho/m}$.]
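A short numerical sketch can make the Landau criterion concrete. It assumes the dispersion obtained by Fourier transforming the first form of (4), namely $\omega^2 = (k^2/2m)(4g^2\bar\rho + k^2/2m)$ with $\hbar = 1$, which reduces to (5) at small k. The parameter values and function names are arbitrary illustrations, not taken from the text.

```python
import math

def omega(k, g, rho_bar, m):
    """Dispersion from the first form of (4): omega^2 = (k^2/2m)(4 g^2 rho_bar + k^2/2m)."""
    kinetic = k * k / (2.0 * m)
    return math.sqrt(kinetic * (4.0 * g * g * rho_bar + kinetic))

def sound_velocity(g, rho_bar, m):
    """Slope of the linear dispersion (5): omega/k = g * sqrt(2 rho_bar / m)."""
    return g * math.sqrt(2.0 * rho_bar / m)

def landau_critical_velocity(g, rho_bar, m, k_values):
    """Smallest omega(k)/k over the sampled wave numbers."""
    return min(omega(k, g, rho_bar, m) / k for k in k_values)

g, rho_bar, m = 0.5, 2.0, 1.0                       # hypothetical parameter values
ks = [10.0 ** (n / 10.0) for n in range(-40, 21)]   # wave numbers from 1e-4 to 1e2

v_interacting = landau_critical_velocity(g, rho_bar, m, ks)
v_free = landau_critical_velocity(0.0, rho_bar, m, ks)

print("repulsive bosons: v_c ~", v_interacting, " g*sqrt(2 rho/m) =", sound_velocity(g, rho_bar, m))
print("free bosons:      v_c ~", v_free)
```

With repulsion switched on, $\omega/k$ is bounded below by the sound velocity, so there is a finite critical velocity. For free bosons $\omega/k = k/2m$ can be made as small as one likes, so no flow velocity is protected.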

Suitably scaling the distance variable, we can summarize the low energy physics of
superfluidity in the compact Lagrangian

$$\mathcal{L} = \frac{1}{4g^2}(\partial_\mu\theta)^2 \qquad (6)$$

which we recognize as the massless version of the scalar field theory we studied in part I,
but with the important proviso that $\theta$ is a phase angle field, that is, $\theta(x)$ and $\theta(x) + 2\pi$ are
really the same. This gapless mode is evidently the Nambu-Goldstone boson associated
with the spontaneous breaking of the global U(1) symmetry $\varphi \to e^{i\alpha}\varphi$.


Linearly dispersing gapless mode 

The physics here becomes particularly clear if we think about a gas of free bosons. We can
give a momentum $\hbar k$ to any given boson at the cost of only $(\hbar k)^2/2m$ in energy. There
exist many low energy excitations in a free boson system. But as soon as a short ranged
repulsion is turned on between the bosons, a boson moving with momentum k would
affect all the other bosons. A density wave is set up as a result, with energy proportional
to k as we have shown in (5). The gapless mode has gone from quadratically dispersing to
linearly dispersing. There are far fewer low energy excitations. Specifically, recall that the
density of states is given by $N(E) \propto k^{D-1}(dk/dE)$. For example, for D = 2 the density of
states goes from $N(E) \propto$ constant (in the presence of quadratically dispersing modes) to
$N(E) \propto E$ (in the presence of linearly dispersing modes) at low energies.
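The power counting behind these two statements can be written out in a few lines. This sketch assumes a pure power-law dispersion $E \propto k^n$, so that $k \propto E^{1/n}$ and $N(E) \propto k^{D-1}(dk/dE) \propto E^{D/n - 1}$. The function name is hypothetical.

```python
from fractions import Fraction

def dos_exponent(D, n):
    """Exponent s in N(E) ~ E**s for a mode with E ~ k**n in D spatial dimensions.

    From N(E) ~ k**(D-1) * dk/dE with k ~ E**(1/n): s = (D-1)/n + (1/n - 1) = D/n - 1.
    """
    return Fraction(D, n) - 1

for D in (1, 2, 3):
    print("D =", D,
          " quadratic (n=2): N(E) ~ E^%s" % dos_exponent(D, 2),
          " linear (n=1): N(E) ~ E^%s" % dos_exponent(D, 1))
```

For D = 2 this gives exponent 0 (a constant) for quadratic dispersion and exponent 1 for linear dispersion, as stated above. In every dimension the linear mode has the larger exponent and hence fewer states at low energy.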

2 L. D. Landau and E. M. Lifshitz, Statistical Physics, p. 238. 

3 Ibid., p. 192. 
As was emphasized by Feynman 4 among others, the physics of superfluidity lies not
in the presence of gapless excitations, but in the paucity of gapless excitations. (After all, 
the Fermi liquid has a continuum of gapless modes.) There are too few modes that the 
superfluid can lose energy and momentum to. 


Relativistic versus nonrelativistic 

This is a good place to discuss one subtle difference between spontaneous symmetry
breaking in relativistic and nonrelativistic theories. Consider the relativistic theory studied
in chapter IV.1: $\mathcal{L} = (\partial\Phi^\dagger)(\partial\Phi) - \lambda(\Phi^\dagger\Phi - v^2)^2$. It is often convenient to take the $\lambda \to \infty$
limit holding v fixed. In the language used in chapter IV.1 “climbing the wall” costs
infinitely more energy than “rolling along the gutter.” The resulting theory is defined by

$$\mathcal{L} = (\partial\Phi^\dagger)(\partial\Phi) \qquad (7)$$

with the constraint $\Phi^\dagger\Phi = v^2$. This is known as a nonlinear $\sigma$ model, about which much
more in chapter VI.4.

The existence of a Nambu-Goldstone boson is particularly easy to see in the nonlinear $\sigma$
model. The constraint is solved by $\Phi = ve^{i\theta}$, which when plugged into $\mathcal{L}$ gives $\mathcal{L} = v^2(\partial\theta)^2$.
There it is: the Nambu-Goldstone boson $\theta$.

Let’s repeat this in the nonrelativistic domain. Take the limit $g^2 \to \infty$ with $\bar\rho$ held fixed
so that (1) becomes

$$\mathcal{L} = i\varphi^\dagger\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi \qquad (8)$$

with the constraint $\varphi^\dagger\varphi = \bar\rho$. But now if we plug the solution of the constraint $\varphi = \sqrt{\bar\rho}\,e^{i\theta}$
into $\mathcal{L}$ (and drop the total derivative $-\bar\rho\,\partial_0\theta$), we get $\mathcal{L} = -(\bar\rho/2m)(\partial_i\theta)^2$ with the equation
of motion $\partial_i^2\theta = 0$. Oops, what is this? It’s not even a propagating degree of freedom!
Where is the Nambu-Goldstone boson?

Knowing what I already told you, you are not going to be puzzled by this apparent
paradox 5 for long, but believe me, I have stumped quite a few excellent relativistic minds
with this one. The Nambu-Goldstone boson is still there, but as we can see from (5) its
propagation velocity $\omega/k$ scales to infinity as g and thus it disappears from the spectrum
for any nonzero k.

Why is it that we are allowed to go to this “nonlinear” limit in the relativistic case? 
Because we have Lorentz invariance! The velocity of a linearly dispersing mode, if such a 
mode exists, is guaranteed to be equal to 1. 


4 R. P. Feynman, Statistical Mechanics. 

5 This apparent paradox was discussed by A. Zee, “From Semionics to Topological Fluids” in O. J. P. Ebolic et 
al., eds., Particle Physics, p. 415. 
Exercises

V.1.1 Verify that the approximation used to reach (3) is consistent.

V.1.2 To confine the superfluid in an external potential $W(\vec{x})$ we would add the term $-W(\vec{x})\varphi^\dagger(\vec{x}, t)\varphi(\vec{x}, t)$
to (1). Derive the corresponding equation of motion for $\varphi$. The equation, known as the Gross-Pitaevski
equation, has been much studied in recent years in connection with the Bose-Einstein condensate.



V.2 Euclid, Boltzmann, Hawking, and
Field Theory at Finite Temperature


Statistical mechanics and Euclidean field theory 

I mentioned in chapter I.2 that to define the path integral more rigorously we should
perform a Wick rotation $t = -it_E$. The scalar field theory, instead of being defined by the
Minkowskian path integral

$$Z = \int D\varphi\, e^{(i/\hbar)\int d^dx\,[\frac{1}{2}(\partial\varphi)^2 - V(\varphi)]} \tag{1}$$

is then defined by the Euclidean functional integral

$$Z = \int D\varphi\, e^{-(1/\hbar)\int d^d_E x\,[\frac{1}{2}(\partial\varphi)^2 + V(\varphi)]} = \int D\varphi\, e^{-(1/\hbar)\mathcal{E}(\varphi)} \tag{2}$$

where $d^dx = -i\,d^d_E x$, with $d^d_E x \equiv dt_E\, d^{(d-1)}x$. In (1) $(\partial\varphi)^2 = (\partial\varphi/\partial t)^2 - (\nabla\varphi)^2$, while in (2)
$(\partial\varphi)^2 = (\partial\varphi/\partial t_E)^2 + (\nabla\varphi)^2$: The notation is a tad confusing but I am trying not to introduce
too many extraneous symbols. You may or may not find it helpful to think of $(\nabla\varphi)^2 + V(\varphi)$
as one unit, untouched by Wick rotation. I have introduced $\mathcal{E}(\varphi) \equiv \int d^d_E x\,[\frac{1}{2}(\partial\varphi)^2 + V(\varphi)]$,
which may naturally be regarded as a static energy functional of the field $\varphi(x)$. Thus,
given a configuration $\varphi(x)$ in d-dimensional space, the more it varies, the less likely it is
to contribute to the Euclidean functional integral $Z$.

The Euclidean functional integral (2) may remind you of statistical mechanics. Indeed,
Herr Boltzmann taught us that in thermal equilibrium at temperature $T = 1/\beta$, the
probability for a configuration to occur in a classical system or the probability for a state to
occur in a quantum system is just the Boltzmann factor $e^{-\beta E}$ suitably normalized, where $E$
is to be interpreted as the energy of the configuration in a classical system or as the energy
eigenvalue of the state in a quantum system. In particular, recall the classical statistical
mechanics of an N-particle system for which

$$E(p, q) = \sum_i \frac{1}{2m}p_i^2 + V(q_1, q_2, \cdots, q_N)$$





The partition function is given (up to some overall constant) by

$$Z = \prod_i \int dp_i\, dq_i\, e^{-\beta E(p, q)}$$

After doing the integrals over $p$ we are left with the (reduced) partition function

$$Z = \prod_i \int dq_i\, e^{-\beta V(q_1, q_2, \cdots, q_N)}$$

Promoting this to a field theory as in chapter I.3, letting $i \to x$ and $q_i \to \varphi(x)$ as before, we
see that the partition function of a classical field theory with the static energy functional
$\mathcal{E}(\varphi)$ has precisely the form in (2), upon identifying the symbol $\hbar$ as the temperature
$T = 1/\beta$. Thus,

Euclidean quantum field theory in d-dimensional spacetime
~ Classical statistical mechanics in d-dimensional space    (3)


Functional integral representation of the quantum partition function 

More interestingly, we move on to quantum statistical mechanics. The integration over
phase space $\{p, q\}$ is replaced by a trace, that is, a sum over states. Thus the partition
function of a quantum mechanical system (say of a single particle to be definite) with the
Hamiltonian $H$ is given by

$$Z = \mathrm{tr}\, e^{-\beta H} = \sum_n \langle n|e^{-\beta H}|n\rangle$$

In chapter I.2 we worked out the integral representation of $\langle F|e^{-iHT}|I\rangle$. (You should
not confuse the time $T$ with the temperature $T$ of course.) Suppose we want an integral
representation of the partition function. No need to do any more work! We simply replace
the time $T$ by $-i\beta$, set $|I\rangle = |F\rangle = |n\rangle$ and sum over $|n\rangle$ to obtain

$$Z = \mathrm{tr}\, e^{-\beta H} = \int_{\rm PBC} Dq\, e^{-\int_0^\beta d\tau\, L(q)} \tag{4}$$

Tracing the steps from (I.2.3) to (I.2.5) you can verify that here $L(q) = \frac{1}{2}(dq/d\tau)^2 + V(q)$
is precisely the Lagrangian corresponding to $H$ in the Euclidean time $\tau$. The integral over
$\tau$ runs from 0 to $\beta$. The trace operation sets the initial and final states equal and so the
functional integral should be done over all paths $q(\tau)$ with the boundary condition $q(0) =
q(\beta)$. The subscript PBC reminds us of this all important periodic boundary condition.
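As a rough numerical illustration of (4), the sketch below discretizes the Euclidean path integral for a harmonic oscillator and compares the result with the sum over energy levels. The choice of potential $V(q) = \frac{1}{2}\omega^2 q^2$, the units $m = \hbar = 1$, the grid parameters, and all function names are assumptions made for this example and do not come from the text. The periodic boundary condition is what turns the chain of integrals over intermediate positions into a trace of the short-time kernel raised to a power.

```python
import numpy as np

def partition_function_path_integral(beta, omega=1.0, n_slices=64,
                                     q_max=6.0, n_grid=241):
    """Approximate Z = tr exp(-beta*H) for H = p^2/2 + omega^2 q^2/2 (m = hbar = 1)
    by discretizing the Euclidean path integral over periodic paths q(0) = q(beta)."""
    eps = beta / n_slices                      # Euclidean time step
    q = np.linspace(-q_max, q_max, n_grid)     # position grid (illustrative size)
    dq = q[1] - q[0]
    potential = 0.5 * omega**2 * q**2
    # Short-time kernel <q| exp(-eps*H) |q'> with the potential split symmetrically
    kinetic = np.exp(-(q[:, None] - q[None, :])**2 / (2.0 * eps))
    kernel = (np.sqrt(1.0 / (2.0 * np.pi * eps)) * kinetic
              * np.exp(-0.5 * eps * (potential[:, None] + potential[None, :])))
    transfer = kernel * dq                     # include the integration measure
    # Periodic boundary conditions: Z = tr(transfer ** n_slices)
    eigenvalues = np.linalg.eigvalsh(transfer)
    return np.sum(eigenvalues**n_slices)

def partition_function_exact(beta, omega=1.0):
    """Sum over oscillator levels E_n = omega * (n + 1/2)."""
    return 1.0 / (2.0 * np.sinh(0.5 * beta * omega))
```

Calling both functions at the same $\beta$ (for instance `beta = 2.0`) should give values that agree closely, with the small difference controlled by the time step and the grid.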

The extension to field theory is immediate. If $H$ is the Hamiltonian of a quantum field
theory in D-dimensional space [and hence $d = (D + 1)$-dimensional spacetime], then the
partition function (4) is

$$Z = \mathrm{tr}\, e^{-\beta H} = \int_{\rm PBC} D\varphi\, e^{-\int_0^\beta d\tau \int d^Dx\, \mathcal{L}(\varphi)} \tag{5}$$

with the integral evaluated over all paths $\varphi(\vec{x}, \tau)$ such that

$$\varphi(\vec{x}, 0) = \varphi(\vec{x}, \beta) \tag{6}$$

(Here $\varphi$ represents all the Bose fields in the theory.)

A remarkable result indeed! To study a field theory at finite temperature all we have to 
do is rotate it to Euclidean space and impose the boundary condition (6). Thus, 


Euclidean quantum field theory in (D + 1)-dimensional
spacetime, $0 \le \tau < \beta$
~ Quantum statistical mechanics in D-dimensional space    (7)

In the zero temperature limit $\beta \to \infty$ we recover from (5) the standard Wick-rotated
quantum field theory over an infinite spacetime, as we should.

Surely you would hit it big with mystical types if you were to tell them that temperature
is equivalent to cyclic imaginary time. At the arithmetic level this connection comes merely
from the fact that the central objects in quantum physics $e^{-iHT}$ and in thermal physics
$e^{-\beta H}$ are formally related by analytic continuation. Some physicists, myself included, feel
that there may be something profound here that we have not quite understood.


Finite temperature Feynman diagrams 


If we so desire, we can develop the finite temperature perturbation theory of (5), working
out the Feynman rules and so forth. Everything goes through as before with one major
difference stemming from the condition (6) $\varphi(\vec{x}, \tau = 0) = \varphi(\vec{x}, \tau = \beta)$. Clearly, when
we Fourier transform with the factor $e^{i\omega\tau}$, the Euclidean frequency $\omega$ can take on only
discrete values $\omega_n \equiv (2\pi/\beta)n$, with $n$ an integer. The propagator of the scalar field becomes
$1/(k_4^2 + \vec{k}^2) \to 1/(\omega_n^2 + \vec{k}^2)$. Thus, to evaluate the partition function, we simply write the
relevant Euclidean Feynman diagrams and instead of integrating over frequency we sum
over a discrete set of frequencies $\omega_n = (2\pi T)n$, $n = -\infty, \cdots, +\infty$. In other words, after
you beat a Feynman integral down to the form $\int d^d_E k\, F(k_E^2)$, all you have to do is replace it
by $2\pi T \sum_n \int d^Dk\, F[(2\pi T)^2 n^2 + \vec{k}^2]$.

It is instructive to see what happens in the high-temperature $T \to \infty$ limit. In summing
over $\omega_n$, the $n = 0$ term dominates since the combination $(2\pi T)^2 n^2 + \vec{k}^2$ occurs in the
denominator. Hence, the diagrams are evaluated effectively in D-dimensional space. We
lose a dimension! Thus,

Euclidean quantum field theory in D-dimensional spacetime
~ High-temperature quantum statistical mechanics in
D-dimensional space    (8)

This is just the statement that at high temperature quantum statistical mechanics goes
classical [compare (3)].
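The frequency sum rule can be checked on the simplest integrand, $1/(\omega_n^2 + E^2)$. The sketch below is an illustration under assumptions: the function names, the truncation `n_max`, and the test values of $E$ and $T$ are invented for the example. It uses the standard identity $T\sum_n 1/(\omega_n^2 + E^2) = (1/2E)\coth(E/2T)$, which is the rule $\int dk_4/2\pi \to T\sum_n$ applied to one propagator, and it also reports how much of the sum is carried by the $n = 0$ term, which is the quantity that dominates at high temperature.

```python
import numpy as np

def matsubara_sum(energy, temperature, n_max=200_000):
    """T * sum_n 1/(omega_n^2 + E^2), omega_n = 2*pi*T*n, truncated at |n| <= n_max."""
    n = np.arange(-n_max, n_max + 1)
    omega_n = 2.0 * np.pi * temperature * n
    return temperature * np.sum(1.0 / (omega_n**2 + energy**2))

def closed_form(energy, temperature):
    """(1/2E) * coth(E/2T), written as (1/2E) * (1 + 2 n_B) with n_B the Bose-Einstein factor."""
    n_bose = 1.0 / np.expm1(energy / temperature)
    return (1.0 + 2.0 * n_bose) / (2.0 * energy)

def zero_mode_fraction(energy, temperature):
    """Share of the full sum carried by the n = 0 term, which equals T / E^2."""
    return (temperature / energy**2) / closed_form(energy, temperature)
```

Writing the closed form as $(1 + 2n_B)/2E$ makes the Bose-Einstein distribution visible, and `zero_mode_fraction` approaches 1 as $T/E$ grows, in line with the loss of a dimension described above.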
An important application of quantum field theory at finite temperature is to cosmology:
The early universe may be described as a soup of elementary particles at some high 
temperature. 


Hawking radiation 


Hawking radiation from black holes is surely the most striking prediction of gravitational 
physics of the last few decades. The notion of black holes goes all the way back to Michell 
and Laplace, who noted that the escape velocity from a sufficiently massive object may 
exceed the speed of light. Classically, things fall into black holes and that’s that. But with 
quantum physics a black hole can in fact radiate like a black body at a characteristic 
temperature T. 

Remarkably, with what little we learned in chapter I.11 and here, we can actually
determine the black hole temperature. I hasten to add that a systematic development would
be quite involved and fraught with subtleties; indeed, entire books are devoted to this
subject. However, what we need to do is more or less clear. Starting with chapter I.11, we
would have to develop quantum field theory (for instance, that of a scalar field $\varphi$) in curved
spacetime, in particular in the presence of a black hole, and ask what a vacuum state (i.e.,
a state devoid of $\varphi$ quanta) in the far past evolves into in the far future. We would find a
state filled with a thermal distribution of $\varphi$ quanta. We will not do this here.

In hindsight, people have given numerous heuristic arguments for Hawking radiation.
Here is one. Let us look at the Schwarzschild solution (see chapter I.11)

$$ds^2 = \left(1 - \frac{2GM}{r}\right)dt^2 - \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 - r^2 d\theta^2 - r^2\sin^2\theta\, d\phi^2 \tag{9}$$


At the horizon $r = 2GM$, the coefficients of $dt^2$ and $dr^2$ change sign, indicating that
time and space, and hence energy and momentum, are interchanged. Clearly, something
strange must occur. With quantum fluctuations, particle and antiparticle pairs are always
popping in and out of the vacuum, but normally, as we had discussed earlier, the uncertainty
principle limits the amount of time $\Delta t$ the pairs can exist to $\sim 1/\Delta E$. Near the black
hole horizon, the situation is different. A pair can fluctuate out of the vacuum right at the
horizon, with the particle just outside the horizon and the antiparticle just inside; heuristically
the Heisenberg restriction on $\Delta t$ may be evaded since what is meant by energy
changes as we cross the horizon. The antiparticle falls in while the particle escapes to spatial
infinity. Of course, a hand-waving argument like this has to be backed up by detailed
calculations.

If black holes do indeed radiate at a definite temperature $T$, and that is far from obvious
a priori, we can estimate $T$ easily by dimensional analysis. From (9) we see that only the
combination $GM$, which evidently has the dimension of a length, can come in. Since $T$
has the dimension of mass, that is, length inverse, we can only have $T \propto 1/GM$.

To determine T precisely, we resort to a rather slick argument. I warn you from the 
outset that the argument will be slick and should be taken with a grain of salt. It is only 
meant to whet your appetite for a more correct treatment. 
Imagine quantizing a scalar field theory in the Schwarzschild metric, along the line
described in chapter I.11. If upon Wick rotation the field “feels” that time is periodic with
period $\beta$, then according to what we have learned in this chapter the quanta of the scalar
field would think that they are living in a heat bath with temperature $T = 1/\beta$.

Setting $t \to -i\tau$, we rotate the metric to

$$ds^2 = -\left[\left(1 - \frac{2GM}{r}\right)d\tau^2 + \left(1 - \frac{2GM}{r}\right)^{-1}dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\phi^2\right] \tag{10}$$

In the region just outside the horizon $r \gtrsim 2GM$, we perform the general coordinate
transformation $(\tau, r) \to (\alpha, R)$ so that the first two terms in $-ds^2$ become $R^2 d\alpha^2 + dR^2$, namely
the length element squared of flat 2-dimensional Euclidean space in polar coordinates.

To leading order, we can write the Schwarzschild factor $(1 - 2GM/r)$ as
$(r - 2GM)/(2GM) \equiv \gamma^2 R^2$ with the constant $\gamma$ to be determined. Then the second term
becomes $dr^2/(\gamma^2 R^2) = (4GM)^2\gamma^2 dR^2$, and thus we set $\gamma = 1/(4GM)$ to get the desired
$dR^2$. The first two terms in $-ds^2$ are then given by $R^2(d\tau/(4GM))^2 + dR^2$. Thus the Euclidean
time is related to the polar angle by $\tau = 4GM\alpha$ and so has a period of $8\pi GM = \beta$.
We obtain thus the Hawking temperature

$$T = \frac{1}{8\pi GM} = \frac{\hbar c^3}{8\pi GM} \tag{11}$$

Restoring $\hbar$ by dimensional analysis, we see that Hawking radiation is indeed a quantum
effect.
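To get a feeling for the size of (11), here is a small sketch that evaluates the temperature in kelvin. Restoring Boltzmann's constant as $T = \hbar c^3/(8\pi G M k_B)$ is an assumption of this example (the text works in units where $k_B = 1$), and the solar mass is used only as an illustrative input.

```python
import math

HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
G = 6.67430e-11          # m^3 / (kg s^2)
K_B = 1.380649e-23       # J / K (restored by hand; the text sets k_B = 1)
M_SUN = 1.98847e30       # kg, nominal solar mass used as an example input

def hawking_temperature(mass_kg):
    """T = hbar c^3 / (8 pi G M k_B), i.e. (11) with conventional units restored."""
    return HBAR * C**3 / (8.0 * math.pi * G * mass_kg * K_B)
```

For a solar-mass black hole the function returns a temperature of a few times $10^{-8}$ K, and doubling the mass halves the temperature, as the $1/GM$ scaling requires.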

It is interesting to note that the Wick rotated geometry just outside the horizon is given
by the direct product of a plane with a 2-sphere of radius $2GM$, although this observation
is not needed for the calculation we just did.


Exercises 

V.2.1 Study the free field theory $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - \frac{1}{2}m^2\varphi^2$ at finite temperature and derive the Bose-Einstein
distribution.

V.2.2 It probably does not surprise you that for fermionic fields the periodic boundary condition (6) is replaced
by an antiperiodic boundary condition $\psi(\vec{x}, 0) = -\psi(\vec{x}, \beta)$ in order to reproduce the results of chapter
II.5. Prove this by looking at the simplest fermionic functional integral. [Hint: The clearest exposition
of this satisfying fact may be found in appendix A of R. Dashen, B. Hasslacher, and A. Neveu, Phys. Rev.
D12: 2443, 1975.]

V.2.3 It is interesting to consider quantum field theory at finite density, as may occur in dense astrophysical
objects or in heavy ion collisions. (In the previous chapter we studied a system of bosons at finite density
and zero temperature.) In statistical mechanics we learned to go from the partition function to the grand
partition function $Z = \mathrm{tr}\, e^{-\beta(H - \mu N)}$, where a chemical potential $\mu$ is introduced for every conserved
particle number $N$. For example, for noninteracting relativistic fermions, the Lagrangian is modified
to $\mathcal{L} = \bar\psi(i\,\partial\!\!\!/ - m)\psi + \mu\bar\psi\gamma^0\psi$. Note that finite density, as well as finite temperature, breaks Lorentz
invariance. Develop the subject of quantum field theory at finite density as far as you can.




Landau-Ginzburg Theory of Critical Phenomena 


The emergence of nonanalyticity 

Historically, the notion of spontaneous symmetry breaking, originating in the work of 
Landau and Ginzburg on second-order phase transitions, came into particle physics from 
condensed matter physics. 

Consider a ferromagnetic material in thermal equilibrium at temperature $T$. The magnetization
$M(x)$ is defined as the average of the atomic magnetic moments taken over
a region of a size much larger than the length scale characteristic of the relevant microscopic
physics. (In this chapter, we are discussing a nonrelativistic theory and $x$ denotes the
spatial coordinates only.) We know that at low temperatures, rotational invariance is spontaneously
broken and that the material exhibits a bulk magnetization pointing in some
direction. As the temperature is raised past some critical temperature $T_c$ the bulk magnetization
suddenly disappears. We understand that with increased thermal agitation the
atomic magnetic moments point in increasingly random directions, canceling each other
out. More precisely, it was found experimentally that just below $T_c$ the magnetization $|M|$
vanishes as $\sim (T_c - T)^\beta$, where the so-called critical exponent $\beta \simeq 0.37$.

This sudden change is known as a second order phase transition, an example of a
critical phenomenon. Historically, critical phenomena presented a challenge to theoretical
physicists. In principle, we are to compute the partition function $Z = \mathrm{tr}\, e^{-\mathcal{H}/T}$ with
the microscopic Hamiltonian $\mathcal{H}$, but $Z$ is apparently smooth in $T$ except possibly at
$T = 0$. Some physicists went as far as saying that nonanalytic behavior such as $(T_c - T)^\beta$
is impossible and that within experimental error $|M|$ actually vanishes as a smooth
function of $T$. Part of the importance of Onsager’s famous exact solution in 1944 of the
2-dimensional Ising model is that it settled this question definitively. The secret is that an
infinite sum of terms each of which may be analytic in some variable need not be analytic
in that variable. The trace in $\mathrm{tr}\, e^{-\mathcal{H}/T}$ sums over an infinite number of terms.
Arguing from symmetry


In most situations, it is essentially impossible to calculate $Z$ starting with the microscopic
Hamiltonian. Landau and Ginzburg had the brilliant insight that the form of the free
energy $G$ as a function of $M$ for a system with volume $V$ could be argued from general
principles. First, for $M$ constant in $x$, we have by rotational invariance

$$G = V[aM^2 + b(M^2)^2 + \cdots] \tag{1}$$

where $a, b, \cdots$ are unknown (but expected to be smooth) functions of $T$. Landau and
Ginzburg supposed that $a$ vanishes at some temperature $T_c$. Unless there is some special
reason, we would expect that for $T$ near $T_c$ we have $a = a_1(T - T_c) + \cdots$ [rather than, say,
$a = a_2(T - T_c)^2 + \cdots$]. But you already learned in chapter IV.1 what would happen. For
$T > T_c$, $G$ is minimized at $M = 0$, but as $T$ drops below $T_c$, new minima suddenly develop
at $|M| = (-a/2b)^{1/2} \sim (T_c - T)^{1/2}$. Rotational symmetry is spontaneously broken, and the
mysterious nonanalytic behavior pops out easily.
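The square-root behavior can be seen numerically by minimizing the free energy density in (1) directly. The sketch below is only an illustration: the values $a_1 = b = 1$, the brute-force grid search, and the function names are assumptions of the example. It locates the minimum of $aM^2 + bM^4$ for several temperatures just below $T_c$ and fits the exponent from a log-log slope.

```python
import numpy as np

def equilibrium_magnetization(t_reduced, a1=1.0, b=1.0):
    """Minimize G/V = a M^2 + b M^4 with a = -a1 * t_reduced, where
    t_reduced = T_c - T, by brute-force search over |M| >= 0."""
    m = np.linspace(0.0, 2.0, 200_001)
    a = -a1 * t_reduced
    free_energy = a * m**2 + b * m**4
    return m[np.argmin(free_energy)]

def fitted_exponent(a1=1.0, b=1.0):
    """Slope of log|M| versus log(T_c - T) for temperatures just below T_c."""
    t_values = np.logspace(-3, -1, 9)
    m_values = np.array([equilibrium_magnetization(t, a1, b) for t in t_values])
    slope, _ = np.polyfit(np.log(t_values), np.log(m_values), 1)
    return slope
```

Above $T_c$ (negative `t_reduced`) the search returns $M = 0$, and the fitted slope below $T_c$ comes out close to $\frac{1}{2}$, the mean field value rather than the measured 0.37.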

To include the possibility of $M$ varying in space, Landau and Ginzburg argued that $G$
must have the form

$$G = \int d^3x\,\{\partial_i M\,\partial_i M + aM^2 + b(M^2)^2 + \cdots\} \tag{2}$$

where the coefficient of the $(\partial_i M)^2$ term has been set to 1 by rescaling $M$. You would
recognize (2) as the Euclidean version of the scalar field theory we have been studying. By
dimensional analysis we see that $1/\sqrt{a}$ sets the length scale. More precisely, for $T > T_c$,
let us turn on a perturbing external magnetic field $H(x)$ by adding the term $-H \cdot M$.
Assuming $M$ small and minimizing $G$ we obtain $(-\partial^2 + a)M \simeq H$, with the solution

$$M(x) = \int d^3y \left\{\int \frac{d^3k}{(2\pi)^3}\frac{e^{i\vec{k}\cdot(\vec{x}-\vec{y})}}{\vec{k}^2 + a}\right\} H(y) = \int d^3y\, \frac{1}{4\pi|\vec{x}-\vec{y}|}\, e^{-\sqrt{a}|\vec{x}-\vec{y}|}\, H(y) \tag{3}$$

[Recall that we did the integral in (I.4.7); admire the unity of physics!]

It is standard to define a correlation function $\langle M(x)M(0)\rangle$ by asking what the magnetization
$M(x)$ will be if we use a magnetic field sharply localized at the origin to create
a magnetization $M(0)$ there. We expect the correlation function to die off as $e^{-|x|/\xi}$ over
some correlation length $\xi$ that goes to infinity as $T$ approaches $T_c$ from above. The critical
exponent $\nu$ is traditionally defined by $\xi \sim 1/(T - T_c)^\nu$.

In Landau-Ginzburg theory, also known as mean field theory, we obtain $\xi = 1/\sqrt{a}$ and
hence $\nu = \frac{1}{2}$.

The important point is not how well the predicted critical exponents such as $\beta$ and $\nu$
agree with experiment but how easily they emerge from Landau-Ginzburg theory. The
theory provides a starting point for a complete theory of critical phenomena, which was
eventually developed by Kadanoff, Fisher, Wilson, and others using the renormalization
group (to be discussed in chapter VI.8).
The story goes that Landau had a logarithmic scale with which he ranked theoretical
physicists, with Einstein on top, and that after working out Landau-Ginzburg theory he 
moved himself up by half a notch. 


Exercise 

V.3.1 Another important critical exponent $\gamma$ is defined by saying that the susceptibility $\chi \equiv (\partial M/\partial H)|_{H=0}$
diverges $\sim 1/|T - T_c|^\gamma$ as $T$ approaches $T_c$. Determine $\gamma$ in Landau-Ginzburg theory. [Hint: Instructively,
there are two ways of doing it: (a) Add $-H \cdot M$ to (1) for $M$ and $H$ constant in space and solve for $M(H)$.
(b) Calculate the susceptibility function $\chi_{ij}(x - y) \equiv [\partial M_i(x)/\partial H_j(y)]|_{H=0}$ and integrate over space.]




Superconductivity 


Pairing and condensation 

When certain materials are cooled below a certain critical temperature $T_c$, they suddenly
become superconducting. Historically, physicists had long suspected that the superconducting
transition, just like the superfluid transition, has something to do with Bose-Einstein
condensation. But electrons are fermions, not bosons, and thus they first have to pair
into bosons, which then condense. We now know that this general picture is substantially
correct: Electrons form Cooper pairs, whose condensation is responsible for superconductivity.

With brilliant insight, Landau and Ginzburg realized that without having to know the
detailed mechanism driving the pairing of electrons into bosons, they could understand
a great deal about superconductivity by studying the field $\varphi(x)$ associated with these condensing
bosons. In analogy with the ferromagnetic transition in which the magnetization
$M(x)$ in a ferromagnet suddenly changes from zero to a nonzero value when the temperature
drops below some critical temperature, they proposed that $\varphi(x)$ becomes nonzero
for temperatures below $T_c$. (In this chapter $x$ denotes spatial coordinates only.) In statistical
physics, quantities such as $M(x)$ and $\varphi(x)$ that change through a phase transition are
known as order parameters.

The field $\varphi(x)$ carries two units of electric charge and is therefore complex. The discussion
now unfolds much as in chapter V.3 except that $\partial_i\varphi$ should be replaced by $D_i\varphi \equiv
(\partial_i - i2eA_i)\varphi$ since $\varphi$ is charged. Following Landau and Ginzburg and including the energy
of the external magnetic field, we write the free energy as

$$\mathcal{F} = \frac{1}{4}F_{ij}^2 + |D_i\varphi|^2 + a|\varphi|^2 + \frac{b}{2}|\varphi|^4 + \cdots \tag{1}$$

which is clearly invariant under the U(1) gauge transformation $\varphi \to e^{i2e\Lambda}\varphi$ and $A_i \to A_i +
\partial_i\Lambda$. As before, setting the coefficient of $|D_i\varphi|^2$ equal to 1 just amounts to a normalization
choice for $\varphi$.

The similarity between (1) and (IV.6.1) should be evident.
Meissner effect

A hallmark of superconductivity is the Meissner effect, in which an external magnetic
field $B$ permeating the material is expelled from it as the temperature drops below $T_c$. This
indicates that a constant magnetic field inside the material is not favored energetically. The
effective laws of electromagnetism in the material must somehow change at $T_c$. Normally,
a constant magnetic field would cost an energy of the order $\sim B^2 V$, where $V$ is the volume
of the material. Suppose that the energy density is changed from the standard $B^2$ to $A^2$
(where as usual $\nabla \times A = B$). For a constant magnetic field $B$, $A$ grows as the distance and
hence the total energy would grow faster than $V$. After the material goes superconducting,
we have to pay an unacceptably large amount of extra energy to maintain the constant
magnetic field and so it is more favorable to expel the magnetic field.

Note that a term like $A^2$ in the effective energy density preserves rotational and translational
invariance but violates electromagnetic gauge invariance. But we already know how
to break gauge invariance from chapter IV.6. Indeed, the U(1) gauge theory described there
and the theory of superconductivity described here are essentially the same, related by a
Wick rotation.

As in chapter V.3 we suppose that for temperature $T \simeq T_c$, $a \simeq a_1(T - T_c)$ while $b$
remains positive. The free energy $\mathcal{F}$ is minimized by $\varphi = 0$ above $T_c$, and by $|\varphi| =
\sqrt{-a/b} \equiv v$ below $T_c$. All this is old hat to you, who have learned that upon symmetry
breaking in a gauge theory the gauge field gains a mass. We simply read off from (1) that

$$\mathcal{F} = \frac{1}{4}F_{ij}^2 + (2ev)^2 A^2 + \cdots \tag{2}$$

which is precisely what we need to explain the Meissner effect.


London penetration length and coherence length 

Physically, the magnetic field does not drop precipitously from some nonzero value outside
the superconductor to zero inside, but drops over some characteristic length scale, called
the London penetration length. The magnetic field leaks into the superconductor a bit over
a length scale $l$, determined by the competition between the energy in the magnetic field
$F_{ij}^2 \sim (\partial A)^2 \sim A^2/l^2$ and the Meissner term $(2ev)^2 A^2$ in (2). Thus, Landau and Ginzburg
obtained the London penetration length $l_L \sim (1/ev) = (1/e)\sqrt{b/(-a)}$.
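The exponential leakage can be illustrated with a one-dimensional toy problem. The sketch below solves $-A'' + m^2 A = 0$ inside a slab by finite differences and fits the decay length near the surface. Treating the problem as one-dimensional, the boundary values, the grid size, and the function names are all assumptions of this example. The mass parameter $m$ is taken as a free input; varying (2) in the gauge $\partial_i A_i = 0$ suggests $m^2 = 2(2ev)^2$ in the normalization used here, but that identification is part of exercise V.4.1 and is not relied on by the code.

```python
import numpy as np

def vector_potential_profile(mass, length=20.0, n_points=1001):
    """Solve -A'' + mass^2 A = 0 on 0 <= x <= length with A(0) = 1, A(length) = 0
    by second-order finite differences. x = 0 plays the role of the surface."""
    x = np.linspace(0.0, length, n_points)
    h = x[1] - x[0]
    n_inner = n_points - 2
    main = (2.0 / h**2 + mass**2) * np.ones(n_inner)
    off = (-1.0 / h**2) * np.ones(n_inner - 1)
    matrix = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    rhs = np.zeros(n_inner)
    rhs[0] = 1.0 / h**2            # boundary value A(0) = 1 moved to the right-hand side
    a_inner = np.linalg.solve(matrix, rhs)
    return x, np.concatenate(([1.0], a_inner, [0.0]))

def fitted_penetration_length(mass):
    """Fit A(x) ~ exp(-x / l) close to the surface and return l."""
    x, a = vector_potential_profile(mass)
    near_surface = x < 5.0 / mass
    slope, _ = np.polyfit(x[near_surface], np.log(a[near_surface]), 1)
    return -1.0 / slope
```

The fitted length comes out very close to $1/m$, which is the scaling $l_L \sim 1/ev$ quoted above up to a numerical factor.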

Similarly, the characteristic length scale over which the order parameter $\varphi$ varies is
known as the coherence length $l_\varphi$, which can be estimated by balancing the second and
third terms in (1), roughly $(\partial\varphi)^2 \sim \varphi^2/l_\varphi^2$ and $a\varphi^2$ against each other, giving a coherence
length of order $l_\varphi \sim 1/\sqrt{-a}$.

Putting things together, we have

$$\frac{l_L}{l_\varphi} \sim \frac{\sqrt{b}}{e} \tag{3}$$
You might recognize from chapter IV.6 that this is just the ratio of the mass of the scalar
field to the mass of the vector field. 

As I remarked earlier, the concept of spontaneous symmetry breaking went from condensed
matter physics to particle physics. After hearing a talk at the University of Chicago
on the Bardeen-Cooper-Schrieffer theory of superconductivity by the young Schrieffer,
Nambu played an influential role in bringing spontaneous symmetry breaking to the particle
physics community.


Exercises 

V.4.1 Vary (1) to obtain the equation for $A$ and determine the London penetration length more carefully.
V.4.2 Determine the coherence length more carefully. 




Peierls Instability 


Noninteracting hopping electrons 

The appearance of the Dirac equation and a relativistic field theory in a solid would be 
surprising indeed, but yes, it is possible. 

Consider the Hamiltonian

$$H = -t\sum_j (c^\dagger_{j+1}c_j + c^\dagger_j c_{j+1}) \tag{1}$$

describing noninteracting electrons hopping on a 1-dimensional lattice (figure V.5.1). Here
$c_j$ annihilates an electron on site $j$. Thus, the first term describes an electron hopping from
site $j$ to site $j + 1$ with amplitude $t$. We have suppressed the spin labels. This is just about
the simplest solid state model; a good place to read about it is in Feynman’s “Freshman
lectures.”

Fourier transforming $c_j = \sum_k e^{ikaj}c(k)$ (where $a$ is the spacing between sites), we
immediately find the energy spectrum $\varepsilon(k) = -2t\cos ka$ (fig. V.5.2). Imposing a periodic
boundary condition on a lattice with $N$ sites, we have $k = (2\pi/Na)n$ with $n$ an integer
from $-\frac{1}{2}N$ to $\frac{1}{2}N$. As $N \to \infty$, $k$ becomes a continuous rather than a discrete variable. As
usual, the Brillouin zone is defined by $-\pi/a < k \le \pi/a$.

There is absolutely nothing relativistic about any of this. Indeed, at the bottom of the
spectrum the energy (up to an irrelevant additive constant) goes as $\varepsilon(k) \simeq 2t\,\frac{1}{2}(ka)^2 \equiv
k^2/2m_{\rm eff}$. The electron disperses nonrelativistically with an effective mass $m_{\rm eff}$.
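The spectrum $\varepsilon(k) = -2t\cos ka$ can be confirmed by diagonalizing the hopping matrix on a finite ring. The sketch below is an illustration: the function names and the default values $t = a = 1$ are assumptions of this example. It builds the single-particle matrix of (1) with periodic boundary conditions and lists the band energies at the allowed momenta $k = 2\pi n/(Na)$.

```python
import numpy as np

def hopping_hamiltonian(n_sites, t=1.0):
    """Single-particle matrix of H = -t sum_j (c†_{j+1} c_j + h.c.) on a ring
    of n_sites sites (n_sites >= 3)."""
    h = np.zeros((n_sites, n_sites))
    for j in range(n_sites):
        k = (j + 1) % n_sites       # periodic boundary condition
        h[k, j] -= t
        h[j, k] -= t
    return h

def band_energies(n_sites, t=1.0, a=1.0):
    """epsilon(k) = -2 t cos(k a) at the allowed momenta k = 2 pi n / (N a)."""
    k = 2.0 * np.pi * np.arange(n_sites) / (n_sites * a)
    return -2.0 * t * np.cos(k * a)
```

Sorting `np.linalg.eigvalsh(hopping_hamiltonian(N))` and `band_energies(N)` should give the same list of numbers for any ring size `N`.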


Figure V.5.1 (lattice sites $j - 1$, $j$, $j + 1$)

Figure V.5.2


But now let us fill the system with electrons up to some Fermi energy $\varepsilon_F$ (see fig.
V.5.2). Focus on an electron near the Fermi surface and measure its energy from $\varepsilon_F$
and momentum from $+k_F$. Suppose we are interested in electrons with energy small
compared to $\varepsilon_F$, that is, $E \equiv \varepsilon - \varepsilon_F \ll \varepsilon_F$, and momentum small compared to $k_F$, that is,
$p \equiv k - k_F \ll k_F$. These electrons obey a linear energy-momentum dispersion $E = v_F p$
with the Fermi velocity $v_F = (\partial\varepsilon/\partial k)|_{k=k_F}$. We will call the field associated with these
electrons $\psi_R$, where the subscript indicates that they are “right moving.” It satisfies the
equation of motion $(\partial/\partial t + v_F\,\partial/\partial x)\psi_R = 0$.

Similarly, the electrons with momentum around $-k_F$ obey the dispersion $E = -v_F p$.
We will call the field associated with these electrons $\psi_L$ with L for “left moving,” satisfying
$(\partial/\partial t - v_F\,\partial/\partial x)\psi_L = 0$.


Emergence of the Dirac equation 

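The Lagrangian summarizing all this is simply

$$\mathcal{L} = i\psi_R^\dagger\left(\frac{\partial}{\partial t} + v_F\frac{\partial}{\partial x}\right)\psi_R + i\psi_L^\dagger\left(\frac{\partial}{\partial t} - v_F\frac{\partial}{\partial x}\right)\psi_L \tag{2}$$

Introducing a 2-component field $\psi = \begin{pmatrix}\psi_R\\ \psi_L\end{pmatrix}$, $\bar\psi \equiv \psi^\dagger\gamma^0 = \psi^\dagger\sigma_2$, and choosing units so
that $v_F = 1$, we may write $\mathcal{L}$ more compactly as

$$\mathcal{L} = \bar\psi\, i\gamma^\mu\partial_\mu\psi \tag{3}$$

with $\gamma^0 = \sigma_2$ and $\gamma^1 = i\sigma_1$ satisfying the Clifford algebra $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$.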
Amazingly enough, the (1 + 1)-dimensional Dirac Lagrangian emerges in a totally
nonrelativistic situation! 
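The algebra behind this can be checked in a few lines. The sketch below verifies that $\gamma^0 = \sigma_2$ and $\gamma^1 = i\sigma_1$ satisfy the Clifford algebra and that their product is $\sigma_3$. The metric signature $g^{\mu\nu} = \mathrm{diag}(+1, -1)$ and the variable names are assumptions of this example.

```python
import numpy as np

SIGMA_1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA_3 = np.array([[1, 0], [0, -1]], dtype=complex)

GAMMA = [SIGMA_2, 1j * SIGMA_1]          # gamma^0, gamma^1
METRIC = np.diag([1.0, -1.0])            # g^{mu nu} in 1+1 dimensions (assumed signature)
GAMMA_5 = GAMMA[0] @ GAMMA[1]            # gamma^5 = gamma^0 gamma^1

def clifford_algebra_holds():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} for all mu, nu."""
    identity = np.eye(2)
    for mu in range(2):
        for nu in range(2):
            anticommutator = GAMMA[mu] @ GAMMA[nu] + GAMMA[nu] @ GAMMA[mu]
            if not np.allclose(anticommutator, 2.0 * METRIC[mu, nu] * identity):
                return False
    return True
```

Since `GAMMA_5` equals $\sigma_3$, the combinations $\frac{1}{2}(I \pm \gamma^5)$ are diagonal and simply pick out the upper or lower component of the 2-component field.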


An instability 

I will now go on to discuss an important phenomenon known as Peierls’s instability. I will
necessarily have to be a bit sketchy. I don’t have to tell you again that this is not a text on
solid state physics, but in any case you will not find it difficult to fill in the gaps.

Peierls considered a distortion in the lattice, with the ion at site $j$ displaced from its
equilibrium position by $\cos[q(ja)]$. (Shades of our mattress from chapter I.1!) A lattice
distortion with wave vector $q = 2k_F$ will connect electrons with momentum $k_F$ with
electrons with momentum $-k_F$. In other words, it connects right moving electrons with
left moving electrons, or in our field theoretic language $\psi_R$ with $\psi_L$. Since the right
moving electrons and the left moving electrons on the surface of the Fermi sea have the
same energy (namely $\varepsilon_F$, duh!) we have the always interesting situation of degenerate
perturbation theory: $\begin{pmatrix}\varepsilon_F & 0\\ 0 & \varepsilon_F\end{pmatrix} + \begin{pmatrix}0 & \delta\\ \delta & 0\end{pmatrix}$ with eigenvalues $\varepsilon_F \pm \delta$. A gap opens at the surface
of the Fermi sea. Here $\delta$ represents the perturbation. Thus, Peierls concluded that the
spectrum changes drastically and the system is unstable under a perturbation with wave
vector $2k_F$.

A particularly interesting situation occurs when the system is half filled with electrons
(so that the density is one electron per site; recall that electrons have up and down spin).
In other words, $k_F = \pi/2a$ and thus $2k_F = \pi/a$. A lattice distortion of the form shown in
figure V.5.3 has precisely this wave vector. Peierls showed that a half-filled system would
want to distort the lattice in this way, doubling the unit cell. It is instructive to see how this
physical phenomenon emerges in a field theoretic formulation.
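The opening of a gap at half filling can be seen on a finite ring. The sketch below models the distortion as an alternating hopping amplitude $t \pm \delta$, which is an assumption of this example (the text does not spell out how the displacement enters the Hamiltonian), and the $\delta$ used here is a hopping modulation, not necessarily normalized like the $\delta$ in the degenerate perturbation matrix. The function names and ring size are likewise illustrative.

```python
import numpy as np

def dimerized_ring(n_sites, t=1.0, delta=0.0):
    """Hopping matrix on a ring (n_sites even) with alternating amplitudes
    t + delta and t - delta on successive bonds."""
    h = np.zeros((n_sites, n_sites))
    for j in range(n_sites):
        k = (j + 1) % n_sites
        amplitude = t + delta * (-1) ** j
        h[j, k] = h[k, j] = -amplitude
    return h

def gap_at_half_filling(n_sites, t=1.0, delta=0.0):
    """Energy difference between the lowest empty and the highest filled level
    when the lower half of the single-particle levels is occupied."""
    levels = np.linalg.eigvalsh(dimerized_ring(n_sites, t, delta))
    half = n_sites // 2
    return levels[half] - levels[half - 1]
```

For a ring whose size is a multiple of 4, so that $k = \pm\pi/2a$ are allowed momenta, the gap vanishes at $\delta = 0$ and equals $4\delta$ in this parametrization once the bonds alternate.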

Denote the displacement of the ion at site $j$ by $d_j$. In the continuum limit, we should be
able to replace $d_j$ by a scalar field. Show that a perturbation connecting $\psi_R$ and $\psi_L$ couples
to $\bar\psi\psi$ and $\bar\psi\gamma^5\psi$, and that a linear combination of $\bar\psi\psi$ and $\bar\psi\gamma^5\psi$ can always be rotated to
$\bar\psi\psi$ by a chiral transformation (see exercise V.5.1). Thus, we extend (3) to

$$\mathcal{L} = \bar\psi\, i\gamma^\mu\partial_\mu\psi + \frac{1}{2}[(\partial_t\varphi)^2 - v^2(\partial_x\varphi)^2] - \frac{1}{2}\mu^2\varphi^2 + g\varphi\bar\psi\psi + \cdots \tag{4}$$

Remember that you worked out the effective potential $V_{\rm eff}(\varphi)$ of this (1 + 1)-dimensional
field theory in exercise IV.3.2: $V_{\rm eff}(\varphi)$ goes as $\varphi^2\log\varphi^2$ for small $\varphi$, which overwhelms the
$\frac{1}{2}\mu^2\varphi^2$ term. Thus, the symmetry $\varphi \to -\varphi$ is dynamically broken. The field $\varphi$ acquires a
vacuum expectation value and $\psi$ becomes massive. In other words, the electron spectrum
develops a gap.


Figure V.5.3 
Exercise

V.5.1 Parallel to the discussion in chapter II.1 you can see easily that the space of 2 by 2 matrices is spanned by
the four matrices $I$, $\gamma^\mu$, and $\gamma^5 \equiv \gamma^0\gamma^1 = \sigma_3$. (Note the peculiar but standard notation of $\gamma^5$.) Convince
yourself that $\frac{1}{2}(I \pm \gamma^5)$ projects out right- and left-handed fields just as in (3 + 1)-dimensional spacetime.
Show that in the bilinear $\bar\psi\gamma^\mu\psi$ left-handed fields are connected to left-handed fields and right-handed
to right-handed and that in the scalar $\bar\psi\psi$ and the pseudoscalar $\bar\psi\gamma^5\psi$ right-handed is connected to
left-handed and vice versa. Finally, note that under the transformation $\psi \to e^{i\theta\gamma^5}\psi$ the scalar and the
pseudoscalar rotate into each other. Check that this transformation leaves the massless Dirac Lagrangian
(3) invariant.



V.6 Solitons


Breaking the shackles of Feynman diagrams 

When I teach quantum field theory I like to tell the students that by the mid-1970s field 
theorists were breaking the shackles of Feynman diagrams. A bit melodramatic, yes, 
but by that time Feynman diagrams, because of their spectacular successes in quantum 
electrodynamics, were dominating the thinking of many field theorists, perhaps to excess. 
As a student I was even told that Feynman diagrams define quantum field theory, that 
quantum fields were merely the “slices of venison” 1 used to derive the Feynman rules, and 
should be discarded once the rules were obtained. The prevailing view was that it barely 
made sense to write down (p(x). This view was forever shattered with the discovery of 
topological solitons, as we will now discuss. 


Small oscillations versus lumps 

Consider once again our favorite toy model $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - V(\varphi)$ with the infamous double-well
potential $V(\varphi) = (\lambda/4)(\varphi^2 - v^2)^2$ in (1 + 1)-dimensional spacetime. In chapter IV.1
we learned that of the two vacua $\varphi = \pm v$ we are to pick one and study small oscillations
around it. So, pick one and write $\varphi = v + \chi$, expand $\mathcal{L}$ in $\chi$, and study the dynamics of
the $\chi$ meson with mass $\mu = (\lambda v^2)^{1/2}$. Physics then consists of suitably quantized waves
oscillating about the vacuum $v$.

But that is not the whole story. We can also have a time independent field configuration
with $\varphi(x)$ (in this and the next chapter $x$ will denote only space unless it is clearly meant
to be otherwise from the context) taking on the value $-v$ as $x \to -\infty$ and $+v$ as $x \to +\infty$,
and changing from $-v$ to $+v$ around some point $x_0$ over some length scale $l$ as shown in

1 Gell-Mann used to speak about how pheasant meat is cooked in France between two slices of venison which 
are then discarded. He forcefully advocated a program to extract and study the algebraic structure of quantum 
field theories which are then discarded. 
Figure V.6.1


figure V.6.1a. [Note that if we consider the Euclidean version of the field theory, identify the time coordinate as the $y$ coordinate, and think of $\varphi(x, y)$ as the magnetization (as in chapter V.3), then the configuration here describes a “domain wall” in a 2-dimensional magnetic system.]

Think about the energy per unit length 


$$\varepsilon(x) = \frac{1}{2}\left(\frac{d\varphi}{dx}\right)^2 + \frac{\lambda}{4}(\varphi^2 - v^2)^2 \tag{1}$$


for this configuration, which I plot in figure V.6.1b. Far away from $x_0$ we are in one of the two vacua and there is no energy density. Near $x_0$, the two terms in $\varepsilon(x)$ both contribute to the energy or mass $M = \int dx\, \varepsilon(x)$: the “spatial variation” (in a slight abuse of terminology often called the “kinetic energy”) term $\int dx\, \frac{1}{2}(d\varphi/dx)^2 \sim l(v/l)^2 \sim v^2/l$, and the “potential energy” term $\int dx\, \lambda(\varphi^2 - v^2)^2 \sim l\lambda v^4$. To minimize the total energy the spatial variation term wants $l$ to be large, while the potential term wants $l$ to be small. The competition $dM/dl = 0$ gives $v^2/l \sim l\lambda v^4$, thus fixing $l \sim (\lambda v^2)^{-1/2} \sim 1/\mu$. The mass comes out to be $\sim \mu v^2 \sim \mu(\mu^2/\lambda)$.

We have a lump of energy spread over a region of length $l$ of the order of the Compton wavelength of the $\chi$ meson. By translation invariance, the center of the lump $x_0$ can be anywhere. Furthermore, since the theory is Lorentz invariant, we can always boost to send the lump moving at any velocity we like. Recalling a famous retort in the annals of American politics (“It walks like a duck, quacks like a duck, so Mr. Senator, why don’t you want to call it a duck?”) we have here a particle, known as a kink or a soliton, with mass $\sim \mu(\mu^2/\lambda)$ and size $\sim l$. Perhaps because of the way the soliton was discovered, many physicists think of it as a big lumbering object, but as we have seen, the size of a soliton $l \sim 1/\mu$ can be made as small as we like by increasing $\mu$. So a soliton could look like a point particle. We will come back to this point in chapter VI.3 when we discuss duality.


Topological stability 

While the kink and the meson are the same size, for small $\lambda$ the kink is much more massive than the meson. Nevertheless, the kink cannot decay into mesons because it costs an infinite amount of energy to undo the kink [by “lifting” $\varphi(x)$ over the potential energy barrier to change it from $+v$ to $-v$ for $x$ from some point $x_0$ to $+\infty$, for example]. The kink is said to be topologically stable.

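The stability is formally guaranteed by the conserved current

$$J^\mu = \frac{1}{2v}\,\varepsilon^{\mu\nu}\partial_\nu\varphi \tag{2}$$

with the charge

$$Q = \int_{-\infty}^{+\infty} dx\, J^0(x) = \frac{1}{2v}\left[\varphi(+\infty) - \varphi(-\infty)\right]$$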

Mesons, which are small localized packets of oscillations in the field, clearly have $Q = 0$, while the kink has $Q = 1$. Thus, the kink cannot decay into a bunch of mesons. Incidentally, the charge density $J^0 = (1/2v)(d\varphi/dx)$ is concentrated at $x_0$ where $\varphi$ changes most rapidly, as you would expect.
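As a quick numerical illustration of the charge assignments above, here is a minimal sketch. The tanh profiles, the Gaussian wave packet, and the values of `V`, `K`, and `box` are illustrative assumptions of mine, not configurations taken from the text; only the formula for $Q$ is.

```python
import math

V = 1.0   # vacuum expectation value v (illustrative choice)
K = 1.0   # inverse width of the illustrative profiles (assumption)

def kink(x):
    return V * math.tanh(K * x)

def antikink(x):
    return -V * math.tanh(K * x)

def meson_packet(x):
    # small localized oscillation about the vacuum +v
    return V + 0.1 * math.exp(-x * x) * math.cos(5.0 * x)

def antikink_kink_pair(x, d=10.0):
    # antikink centered at -d and kink centered at +d: +v -> -v -> +v
    return V * (1.0 - math.tanh(K * (x + d)) + math.tanh(K * (x - d)))

def topological_charge(phi, box=40.0):
    # Q = [phi(+inf) - phi(-inf)] / (2v), with infinity replaced by +-box
    return (phi(box) - phi(-box)) / (2.0 * V)

for name, phi in [("kink", kink), ("antikink", antikink),
                  ("meson packet", meson_packet),
                  ("antikink-kink pair", antikink_kink_pair)]:
    print(name, round(topological_charge(phi), 6))
```

Only the values of the field at the two ends of the box enter, which is the whole point: no local wiggling of $\varphi$ can change $Q$.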

Note that $\partial_\mu J^\mu = 0$ follows immediately from the antisymmetric symbol $\varepsilon^{\mu\nu}$ and does not depend on the equation of motion. The current $J^\mu$ is known as a “topological current.” Its existence does not follow from Noether’s theorem (chapter I.10) but from topology.

Our discussion also makes clear the existence of an antikink with $Q = -1$ and described by a configuration with $\varphi(-\infty) = +v$ and $\varphi(+\infty) = -v$. The name is justified by considering the configuration pictured in figure V.6.2 containing a kink and an antikink far apart. As the kink and the antikink move closer to each other, they clearly can annihilate into mesons, since the configuration shown in figure V.6.2 and the vacuum configuration with $\varphi(x) = +v$ everywhere are separated by a finite amount of energy.


A non perturbative phenomenon 

That the mass of the kink comes out inversely proportional to the coupling $\lambda$ is a clear sign that field theorists could have done perturbation theory in $\lambda$ till they were blue in the face without ever discovering the kink. Feynman diagrams could not have told us about it.
Figure V.6.2 (an antikink on the left and a kink on the right)


You can calculate the mass of a kink by minimizing

$$M = \int dx\left[\frac{1}{2}\left(\frac{d\varphi}{dx}\right)^2 + \frac{\lambda}{4}(\varphi^2 - v^2)^2\right] = \mu\left(\frac{\mu^2}{\lambda}\right)\int dy\left[\frac{1}{2}\left(\frac{df}{dy}\right)^2 + \frac{1}{4}(f^2 - 1)^2\right]$$

where in the last step we performed the obvious scaling $\varphi(x) \to v f(y)$ and $y = \mu x$. This scaling argument immediately showed that the mass of the kink $M = a(\mu^2/\lambda)\mu$ with $a$ a pure number: The heuristic estimate of the mass proved to be highly trustworthy. The actual function $\varphi(x)$ and hence $a$ can be computed straightforwardly with standard variational methods.


Bogomol’nyi inequality 

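More cleverly, observe that the energy density (1) is the sum of two squares. Using $a^2 + b^2 \ge 2|ab|$ we obtain

$$M \ge \int dx \left(\frac{\lambda}{2}\right)^{1/2}\left|\frac{d\varphi}{dx}(\varphi^2 - v^2)\right| \ge \left(\frac{\lambda}{2}\right)^{1/2}\left|\left[\frac{1}{3}\varphi^3 - v^2\varphi\right]_{-\infty}^{+\infty}\right|$$

We have the elegant result

$$M \ge |Q| \tag{3}$$

with mass $M$ measured in units of $(4/3\sqrt{2})\mu(\mu^2/\lambda)$. This is an example of a Bogomol’nyi inequality, which plays an important role in string theory.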
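To see the bound at work, here is a minimal numerical sketch. It assumes the standard profile $\varphi(x) = v\tanh(\mu x/\sqrt{2})$ with $\mu^2 = \lambda v^2$, which solves $d\varphi/dx = (\lambda/2)^{1/2}(v^2 - \varphi^2)$ and hence makes the two squares in the energy density equal; the function names and the integration parameters are mine.

```python
import math

def kink_energy_density(x, lam, v):
    k = math.sqrt(lam / 2.0) * v           # inverse width, k = mu / sqrt(2)
    phi = v * math.tanh(k * x)
    dphi = v * k / math.cosh(k * x) ** 2   # analytic d(phi)/dx
    return 0.5 * dphi ** 2 + 0.25 * lam * (phi ** 2 - v ** 2) ** 2

def kink_mass(lam, v, n=40000):
    # trapezoid rule over [-20/k, 20/k]; the density is negligible outside
    k = math.sqrt(lam / 2.0) * v
    half_width = 20.0 / k
    dx = 2.0 * half_width / n
    total = 0.5 * (kink_energy_density(-half_width, lam, v)
                   + kink_energy_density(half_width, lam, v))
    for i in range(1, n):
        total += kink_energy_density(-half_width + i * dx, lam, v)
    return total * dx

def bogomolnyi_bound(lam, v):
    # (4 / (3 sqrt 2)) * mu * (mu^2 / lambda) for |Q| = 1
    mu = math.sqrt(lam) * v
    return 4.0 / (3.0 * math.sqrt(2.0)) * mu ** 3 / lam

for lam, v in [(1.0, 1.0), (0.5, 2.0)]:
    print(lam, v, kink_mass(lam, v), bogomolnyi_bound(lam, v))
```

Within the accuracy of the quadrature the two numbers agree, so for this profile the inequality (3) is saturated.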


Exercises 

V.6.1 Show that if $\varphi(x)$ is a solution of the equation of motion, then so is $\varphi[(x - vt)/\sqrt{1 - v^2}]$.

V.6.2 Discuss the solitons in the so-called sine-Gordon theory $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - g\cos(\beta\varphi)$. Find the topological current. Is the $Q = 2$ soliton stable or not?

V.6.3 Compute the mass of the kink by the brute force method and check the result from the Bogomol’nyi inequality.



Vortices, Monopoles, and Instantons 


Vortices 

The kink is merely the simplest example of a large class of topological objects in quantum field theory.

Consider the theory of a complex scalar field in (2 + 1)-dimensional spacetime $\mathcal{L} = \partial\varphi^\dagger\partial\varphi - \lambda(\varphi^\dagger\varphi - v^2)^2$ with the now familiar Mexican hat potential. With some minor changes in notation, this is the theory we used to describe interacting bosons and superfluids. (We choose to study the relativistic rather than the nonrelativistic version but as you will see the issue does not enter for the questions I want to discuss here.)

Are there solitons, that is, objects like the kink, in this theory? 

Given some time-independent configuration $\varphi(x)$ let us look at its mass or energy

$$M = \int d^2x\left[\partial_i\varphi^\dagger\partial_i\varphi + \lambda(\varphi^\dagger\varphi - v^2)^2\right]. \tag{1}$$

The integrand is a sum of two squares, each of which must give a finite contribution. In particular, for the contribution of the second term to be finite the magnitude of $\varphi$ must approach $v$ at spatial infinity.

This finite energy requirement does not fix the phase of $\varphi$ however. Using polar coordinates $(r, \theta)$ we will consider the Ansatz $\varphi \to v e^{i\theta}$ as $r \to \infty$. Writing $\varphi = \varphi_1 + i\varphi_2$, we see that the vector $(\varphi_1, \varphi_2) = v(\cos\theta, \sin\theta)$ points radially outward at infinity. Recall the definition of the current $J_i = i(\partial_i\varphi^\dagger\varphi - \varphi^\dagger\partial_i\varphi)$ in a bosonic fluid given in chapter III.5. The flow whirls about at spatial infinity, and thus this configuration is known as the vortex.

By explicit differentiation or dimensional analysis, we have $\partial_i\varphi \sim v(1/r)$ as $r \to \infty$. Now look at the first term in $M$. Oops, the energy diverges logarithmically as $v^2\int d^2x\,(1/r^2)$.

Is there a way out? Not unless we change the theory. 
Vortex into flux tubes


Suppose we gauge the theory by replacing $\partial_i\varphi$ by $D_i\varphi = \partial_i\varphi - ieA_i\varphi$. Now we can achieve finite energy by requiring that the two terms in $D_i\varphi$ knock each other out so that $D_i\varphi \to 0$ as $r \to \infty$ faster than $1/r$. In other words, $A_i \to -(i/e)(1/|\varphi|^2)\varphi^\dagger\partial_i\varphi = (1/e)\partial_i\theta$ as $r \to \infty$. Immediately, we have

$$\text{Flux} \equiv \int d^2x\, F_{12} = \oint_C dx_i A_i = \frac{2\pi}{e} \tag{2}$$

where $C$ is an infinitely large circle at spatial infinity and we have used Stokes’ theorem. Thus, in a gauged $U(1)$ theory the vortex carries a magnetic flux inversely proportional to the charge. When I say magnetic, I am presuming that $A$ represents the electromagnetic gauge potential. The vortex discussed here appears as a flux tube in so-called type II superconductors. It is worth remarking that this fundamental unit of flux (2) is normally written in the condensed matter physics literature in unnatural units as

$$\Phi_0 = \frac{2\pi\hbar c}{e} \tag{3}$$

very pleasingly uniting three fundamental constants of Nature.


Homotopy groups 

Since spatial infinity in 2 dimensional space is topologically a unit circle $S^1$ and since the field configuration with $|\varphi| = v$ also forms a circle $S^1$, this boundary condition can be characterized as a map $S^1 \to S^1$. Since this map cannot be smoothly deformed into the trivial map in which $S^1$ is mapped onto a point in $S^1$, the corresponding field configuration is indeed topologically stable. (Think of wrapping a loop of string around a ring.)

Mathematically, maps of $S^n$ into a manifold $M$ are classified by the homotopy group $\Pi_n(M)$, which counts the number of topologically inequivalent maps. You can look up the homotopy groups for various manifolds in tables.$^1$ In particular, for $n \ge 1$, $\Pi_n(S^n) = Z$, where $Z$ is the mathematical notation for the set of all integers. The simplest example $\Pi_1(S^1) = Z$ is proved almost immediately by exhibiting the maps $\varphi \to v e^{im\theta}$ as $r \to \infty$, with $m$ any integer (positive or negative), using the context and notation of our discussion for convenience. Clearly, this map wraps one circle around the other $m$ times.
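The integer $m$ can be read off numerically by following the phase of $\varphi$ once around a large circle. The following is a minimal sketch; the helper names `winding_number` and `vortex`, the radius, and the number of sample points are my own illustrative choices.

```python
import cmath
import math

def winding_number(phi, radius=100.0, n=2000):
    """Number of times the phase of phi winds as we go once anticlockwise
    around a circle of the given radius centered on the origin."""
    total = 0.0
    prev = phi(radius, 0.0)
    for k in range(1, n + 1):
        theta = 2.0 * math.pi * k / n
        cur = phi(radius * math.cos(theta), radius * math.sin(theta))
        total += cmath.phase(cur / prev)   # small phase increment
        prev = cur
    return round(total / (2.0 * math.pi))

def vortex(m, v=1.0):
    """Asymptotic configuration phi = v exp(i m theta)."""
    def phi(x, y):
        return v * cmath.exp(1j * m * math.atan2(y, x))
    return phi

for m in (-2, 0, 1, 3):
    print(m, winding_number(vortex(m)))
```

The sum of phase increments can only change by a multiple of $2\pi$, so a smooth deformation of the field at infinity cannot change the result.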

The language of homotopy groups is not just to impress people, but gives us a unifying language to discuss topological solitons. Indeed, looking back you can now see that the kink is a physical manifestation of $\Pi_0(S^0) = Z_2$, where $Z_2$ denotes the multiplicative group consisting of $\{+1, -1\}$ (since the 0-dimensional sphere $S^0 = \{+1, -1\}$ consists of just two points and is topologically equivalent to the spatial infinity in 1-dimensional space).


1 See tables 6.V and 6.VI, S. Iyanaga and Y. Kawada, eds., Encyclopedic Dictionary of Mathematics, p. 1415. 
Hedgehogs and monopoles


If you absorbed all this, you are ready to move up to (3 + 1)-dimensional spacetime. Spatial infinity is now topologically $S^2$. By now you realize that if the scalar field lives on the manifold $M$, then we have at infinity the map of $S^2 \to M$. The simplest choice is thus to take $S^2$ for $M$. Hence, we are led to scalar fields $\varphi^a$ ($a = 1, 2, 3$) transforming as a vector $\vec\varphi$ under an internal symmetry group $O(3)$ and governed by $\mathcal{L} = \frac{1}{2}\partial\vec\varphi\cdot\partial\vec\varphi - V(\vec\varphi\cdot\vec\varphi)$. (There should be no confusion in using the arrow to indicate a vector in the internal symmetry group.)

Let us choose $V = \lambda(\vec\varphi^2 - v^2)^2$. The story unfolds much as the story of the vortex. The requirement that the mass of a time independent configuration

$$M = \int d^3x\left[\frac{1}{2}(\partial\vec\varphi)^2 + \lambda(\vec\varphi^2 - v^2)^2\right] \tag{4}$$

be finite forces $|\vec\varphi| = v$ at spatial infinity so that $\vec\varphi(r = \infty)$ indeed lives on $S^2$.

The identity map $S^2 \to S^2$ indicates that we should consider a configuration such that

$$\varphi^a \to v\,\frac{x^a}{r} \quad \text{as } r \to \infty \tag{5}$$

This equation looks a bit strange at first sight since it mixes the index of the internal symmetry group with the index of the spatial coordinates (but in fact we have already encountered this phenomenon in the vortex). At spatial infinity, the field $\vec\varphi$ is pointing radially outward, so this configuration is known picturesquely as a hedgehog. Draw a picture if you don’t get it!

As in the vortex story, the requirement that the first term in (4) be finite forces us to introduce an $O(3)$ gauge potential $A^b_i$ so that we can replace the ordinary derivative $\partial_i\varphi^a$ by the covariant derivative $D_i\varphi^a = \partial_i\varphi^a + e\varepsilon^{abc}A^b_i\varphi^c$. We can then arrange $D_i\varphi^a$ to vanish at infinity. Simple arithmetic shows that with (5) the gauge potential has to go as

$$A^b_i \to \frac{1}{e}\,\varepsilon^{bij}\frac{x^j}{r^2} \quad \text{as } r \to \infty \tag{6}$$
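The “simple arithmetic” can be checked numerically. The sketch below assumes the asymptotic forms $\varphi^a = v x^a/r$ and $A^b_i = \varepsilon^{bij}x^j/(e r^2)$ for the hedgehog and its gauge potential, and evaluates $D_i\varphi^a = \partial_i\varphi^a + e\varepsilon^{abc}A^b_i\varphi^c$ with a finite-difference derivative. The function names, the sample point, and the step size are my own choices.

```python
import math

def eps(a, b, c):
    """Totally antisymmetric symbol on indices 0, 1, 2 with eps(0, 1, 2) = +1."""
    return (a - b) * (b - c) * (c - a) / 2

def hedgehog(x, v):
    r = math.sqrt(sum(c * c for c in x))
    return [v * c / r for c in x]

def gauge_potential(x, e):
    # A[b][i] = eps(b, i, j) x^j / (e r^2)
    r2 = sum(c * c for c in x)
    return [[sum(eps(b, i, j) * x[j] for j in range(3)) / (e * r2)
             for i in range(3)] for b in range(3)]

def covariant_derivative(x, v, e, h=1e-6):
    """Return D[i][a] = d_i phi^a + e eps(a, b, c) A^b_i phi^c."""
    A = gauge_potential(x, e)
    f = hedgehog(x, v)
    D = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = hedgehog(xp, v), hedgehog(xm, v)
        for a in range(3):
            ordinary = (fp[a] - fm[a]) / (2.0 * h)   # central difference
            gauge = e * sum(eps(a, b, c) * A[b][i] * f[c]
                            for b in range(3) for c in range(3))
            D[i][a] = ordinary + gauge
    return D

D = covariant_derivative([0.3, -1.2, 2.0], v=1.5, e=0.7)
print(max(abs(D[i][a]) for i in range(3) for a in range(3)))
```

The ordinary derivative alone falls off only as $1/r$; it is the gauge term that cancels it.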


Imagine yourself in a lab at spatial infinity. Inside a small enough lab, the $\vec\varphi$ field at different points all point in approximately the same direction. The gauge group $O(3)$ is broken down to $O(2) \simeq U(1)$. The experimentalists in this lab observe a massless gauge field associated with the $U(1)$, which they might as well call the electromagnetic field (“quacks like a duck”). Indeed, the gauge invariant tensor field

$$\mathcal{F}_{\mu\nu} \equiv \frac{F^a_{\mu\nu}\varphi^a}{|\varphi|} - \frac{\varepsilon^{abc}\varphi^a(D_\mu\varphi)^b(D_\nu\varphi)^c}{e|\varphi|^3} \tag{7}$$

can be identified as the electromagnetic field (see exercise V.7.5).

There is no electric field since the configuration is time independent and $A^b_0 = 0$. We can only have a magnetic field $\vec B$, which you can immediately calculate since you know $A^b_i$, but by symmetry we already see that $\vec B$ can point only in the radial direction.

This is the fabled magnetic monopole first postulated by Dirac! 
The presence of magnetic monopoles in spontaneously broken gauge theory was discovered by ’t Hooft and Polyakov. If you calculate the total magnetic flux coming out of the monopole $\int d\vec S\cdot\vec B$, where as usual $d\vec S$ denotes a small surface element at infinity pointing radially outward, you will find that it is quantized in suitable units, exactly as Dirac had stated, as it must (recall chapter IV.4).

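We can once again write a Bogomol’nyi inequality for the mass of the monopole

$$M = \int d^3x\left[\tfrac{1}{4}(\vec F_{ij})^2 + \tfrac{1}{2}(D_i\vec\varphi)^2 + V(\vec\varphi)\right] \tag{8}$$

[$\vec F_{ij}$ transforms as a vector under $O(3)$; recall (IV.5.17).] Observe that

$$\tfrac{1}{4}(\vec F_{ij})^2 + \tfrac{1}{2}(D_i\vec\varphi)^2 = \tfrac{1}{4}(\vec F_{ij} \pm \varepsilon_{ijk}D_k\vec\varphi)^2 \mp \tfrac{1}{2}\varepsilon_{ijk}\vec F_{ij}\cdot D_k\vec\varphi$$

Thus

$$M \ge \int d^3x\left[\mp\tfrac{1}{2}\varepsilon_{ijk}\vec F_{ij}\cdot D_k\vec\varphi + V(\vec\varphi)\right] \tag{9}$$

We next note that

$$\int d^3x\,\tfrac{1}{2}\varepsilon_{ijk}\vec F_{ij}\cdot D_k\vec\varphi = \int d^3x\,\tfrac{1}{2}\varepsilon_{ijk}\partial_k(\vec F_{ij}\cdot\vec\varphi) = v\int d\vec S\cdot\vec B = 4\pi v g$$

has an elegant interpretation in terms of the magnetic charge $g$ of the monopole. Furthermore, if we can throw away $V(\vec\varphi)$ while keeping $|\vec\varphi| \to v$ as $r \to \infty$, then the inequality $M \ge 4\pi v|g|$ is saturated by $\vec F_{ij} = \pm\varepsilon_{ijk}D_k\vec\varphi$. The solutions of this equation are known as Bogomol’nyi-Prasad-Sommerfeld or BPS states.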

It is not difficult to construct an electrically charged magnetic monopole, known as a dyon. We simply take $A^b_0 = (x^b/r)f(r)$ with some suitable function $f(r)$.

One nice feature of the topological monopole is that its mass comes out to be $\sim M_W/\alpha \sim 137 M_W$ (exercise V.7.11), where $M_W$ denotes the mass of the intermediate vector boson of the weak interaction. We are anticipating chapter VII.2 a bit in that the gauge boson that becomes massive by the Anderson-Higgs mechanism of chapter IV.6 may be identified with the intermediate vector boson. This explains naturally why the monopole has not yet been discovered.


Instanton 


Consider a nonabelian gauge theory, and rotate the path integral to 4-dimensional Euclidean space. We might wish to evaluate $Z = \int DA\, e^{-S(A)}$ in the steepest descent approximation, in which case we would have to find the extrema of

$$S(A) = \int d^4x\, \frac{1}{2g^2}\operatorname{tr} F_{\mu\nu}F_{\mu\nu}$$

with finite action. This implies that at infinity $|x| = \infty$, $F_{\mu\nu}$ must vanish faster than $1/|x|^2$, and so the gauge potential $A_\mu$ must be a pure gauge: $A = g\,dg^\dagger$ for $g$ an element of the gauge group [see (IV.5.6)]. Configurations for which this is true are known as instantons. We see that the instanton is yet one more link in the “great chain of being”: kink-vortex-monopole-instanton.

Choose the gauge group $SU(2)$ to be definite. In the parametrization $g = x_4 + i\vec x\cdot\vec\sigma$ we have by definition $g^\dagger g = 1$ and $\det g = 1$, thus implying $\vec x^2 + x_4^2 = 1$. We learn that the group manifold of $SU(2)$ is $S^3$. Thus, in an instanton, the gauge potential at infinity $A \to g\,dg^\dagger + O(1/|x|^2)$ as $|x| \to \infty$ defines a map $S^3 \to S^3$. Sound familiar? Indeed, you’ve already seen $S^0 \to S^0$, $S^1 \to S^1$, $S^2 \to S^2$ playing a role in field theory.
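The statement that the group manifold of $SU(2)$ is $S^3$ is easy to check by hand, or with a few lines of code. In this minimal sketch the parametrization $g = x_4 + i\vec x\cdot\vec\sigma$ is written out as an explicit $2\times 2$ complex matrix using the standard Pauli matrices; the helper names are mine.

```python
import math

def su2_element(x1, x2, x3, x4):
    """g = x4 + i (x1 sigma1 + x2 sigma2 + x3 sigma3) as a 2x2 complex matrix."""
    return [[complex(x4, x3), complex(x2, x1)],
            [complex(-x2, x1), complex(x4, -x3)]]

def dagger(g):
    return [[g[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det(g):
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def is_su2(g, tol=1e-12):
    p = matmul(dagger(g), g)
    unitary = all(abs(p[i][j] - (1 if i == j else 0)) < tol
                  for i in range(2) for j in range(2))
    return unitary and abs(det(g) - 1) < tol

norm = math.sqrt(1 + 4 + 9 + 16)
on_sphere = su2_element(1 / norm, 2 / norm, 3 / norm, 4 / norm)
off_sphere = su2_element(1.0, 0.0, 0.0, 1.0)
print(is_su2(on_sphere), is_su2(off_sphere))
```

Both $g^\dagger g$ and $\det g$ come out equal to $\vec x^2 + x_4^2$ (times the identity in the first case), so the two conditions hold exactly on the unit 3-sphere.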

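Recall from chapter IV.5 that $\operatorname{tr} F^2 = d\operatorname{tr}(A\,dA + \frac{2}{3}A^3)$. Thus

$$\int \operatorname{tr} F^2 = \int_{S^3} \operatorname{tr}\left(A\,dA + \tfrac{2}{3}A^3\right) = \int_{S^3} \operatorname{tr}\left(AF - \tfrac{1}{3}A^3\right) = -\tfrac{1}{3}\int_{S^3} \operatorname{tr}(g\,dg^\dagger)^3 \tag{10}$$

where we used the fact that $F$ vanishes at infinity. This shows explicitly that $\int \operatorname{tr} F^2$ depends only on the homotopy of the map $S^3 \to S^3$ defined by $g$ and is thus a topological quantity. Incidentally, $\int_{S^3} \operatorname{tr}(g\,dg^\dagger)^3$ is known to mathematicians as the Pontryagin index (see exercise V.7.12).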

I mentioned in chapter IV.7 that the chiral anomaly is not affected by higher-order 
quantum fluctuations. You are now in position to give an elegant topological proof of this 
fact (exercise V.7.13). 


Kosterlitz-Thouless transition

We were a bit hasty in dismissing the vortex in the nongauged theory $\mathcal{L} = \partial\varphi^\dagger\partial\varphi - \lambda(\varphi^\dagger\varphi - v^2)^2$ in (2 + 1)-dimensional spacetime. Around a vortex $\varphi \sim v e^{i\theta}$ and so it is true, as we have noted, that the energy of a single vortex diverges logarithmically. But what about a vortex paired with an antivortex?

Picture a vortex and an antivortex separated by a distance $R$ large compared to the distance scales in the theory. Around an antivortex $\varphi \sim v e^{-i\theta}$. The field $\varphi$ winds around the vortex one way and around the antivortex the other way. Convince yourself by drawing a picture that at spatial infinity $\varphi$ does not wind at all: It just goes to a fixed value. The winding one way cancels the winding the other way.

Thus, a configuration consisting of a vortex-antivortex pair does not cost infinite energy. But it does cost a finite amount of energy: In the region between the vortex and the antivortex $\varphi$ is winding around, in fact roughly twice as fast (as you can see by drawing a picture). A rough estimate of the energy is thus $v^2\int d^2x\,(1/r^2) \sim v^2\log(R/a)$, where we integrated over a region of size $R$, the relevant physical scale in the problem. (To make sense of the problem we divide $R$ by the size $a$ of the vortex.) The vortex and the antivortex attract each other with a logarithmic potential. In other words, the configuration cannot be static: The vortex and the antivortex want to get together and annihilate each other in a fiery embrace and release that finite amount of energy $v^2\log(R/a)$. (Hence the term antivortex.)

All of this is at zero temperature, but in condensed matter physics we are interested in the free energy $F = E - TS$ (with $S$ the entropy) at some temperature $T$ rather than the energy $E$. Appreciating this elementary point, Kosterlitz and Thouless discovered a phase transition as the temperature is raised. Consider a gas of vortices and antivortices at some nonzero temperature. Because of thermal agitation, the vortices and antivortices moving around may or may not find each other to annihilate. How high do we have to crank up the temperature for this to happen?

Let us do a heuristic estimate. Consider a single vortex. Herr Boltzmann tells us that the entropy is the logarithm of the “number” of ways in which we can put the vortex inside a box of size $L$ (which we will let tend to infinity). Thus, $S \sim \log(L/a)$. The entropy $S$ is to battle the energy $E \sim v^2\log(L/a)$. We see that the free energy $F \sim (v^2 - T)\log(L/a)$ goes to infinity if $T \lesssim v^2$, which we identify as essentially the critical temperature $T_c$.

A single vortex cannot exist below T c . Vortices and antivortices are tightly bound below 
T c but are liberated above T c . 
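The heuristic estimate can be made slightly more explicit if we keep the numerical factors. The sketch below assumes the energy functional $\int d^2x\,|\partial_i\varphi|^2$ with $\varphi = v e^{i\theta}$, so that $E = 2\pi v^2\log(L/a)$, counts the number of positions as $(L/a)^2$, and sets Boltzmann's constant to 1. With these conventions, which are my assumptions (the text only claims $T_c \sim v^2$), the sign of $F$ flips at $T = \pi v^2$.

```python
import math

def vortex_free_energy(T, v, L, a):
    """F = E - T S for a single vortex in a box of size L with core size a."""
    energy = 2.0 * math.pi * v ** 2 * math.log(L / a)   # from v^2 * int d^2x / r^2
    entropy = math.log((L / a) ** 2)                    # log of number of positions
    return energy - T * entropy

def critical_temperature(v):
    # temperature at which the coefficient of log(L/a) changes sign
    return math.pi * v ** 2

v, L, a = 1.0, 1.0e6, 1.0
Tc = critical_temperature(v)
for T in (0.5 * Tc, 0.9 * Tc, 1.1 * Tc, 2.0 * Tc):
    print(T / Tc, vortex_free_energy(T, v, L, a))
```

Below $T_c$ the free energy of an isolated vortex grows without bound as $L \to \infty$; above $T_c$ it decreases without bound, and free vortices proliferate.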


Black hole 

The discovery in the 1970s of these topological objects that cannot be seen in perturbation theory came as a shock to the generation of physicists raised on Feynman diagrams and canonical quantization. People (including yours truly) were taught that the field operator $\varphi(x)$ is a highly singular quantum operator and has no physical meaning as such, and that quantum field theory is defined perturbatively by Feynman diagrams. Even quite eminent physicists asked in puzzlement what a statement such as $\varphi \to v e^{i\theta}$ as $r \to \infty$ would mean. Learned discussions that in hindsight are totally irrelevant ensued. As I said in introducing chapter V.6, I like to refer to this historical process as “field theorists breaking the shackles of Feynman diagrams.”

It is worth mentioning one argument physicists at that time used to convince themselves that solitons do exist. After all, the Schwarzschild black hole, defined by the metric $g_{\mu\nu}(x)$ (see chapter I.11), had been known since 1916. Just what are the components of the metric $g_{\mu\nu}(x)$? They are fields in exactly the same way that our scalar field $\varphi(x)$ and our gauge potential $A_\mu(x)$ are fields, and in a quantum theory of gravity $g_{\mu\nu}$ would have to be replaced by a quantum operator just like $\varphi$ and $A_\mu$. So the objects discovered in the 1970s are conceptually no different from the black hole known in the 1910s. But in the early 1970s most particle theorists were not particularly aware of quantum gravity.


Exercises 

V.7.1 Explain the relation between the mathematical statement $\Pi_0(S^0) = Z_2$ and the physical result that there are no kinks with $|Q| \ge 2$.

V.7.2 In the vortex, study the length scales characterizing the variation of the fields $\varphi$ and $A$. Estimate the mass of the vortex.

V.7.3 Consider the vortex configuration in which $\varphi \to v e^{i\nu\theta}$ as $r \to \infty$, with $\nu$ an integer. Calculate the magnetic flux. Show that the magnetic flux coming out of an antivortex (for which $\nu = -1$) is opposite to the magnetic flux coming out of a vortex.

V.7.4 Mathematically, since $g(\theta) = e^{i\nu\theta}$ may be regarded as an element of the group $U(1)$, we can speak of a map of $S^1$, the circle at spatial infinity, onto the group $U(1)$. Calculate $(i/2\pi)\int_{S^1} g\,dg^\dagger$, thus showing that the winding number is given by this integral of a 1-form.

V.7.5 Show that within a region in which $\varphi^a$ is constant, $\mathcal{F}_{\mu\nu}$ as defined in the text is the electromagnetic field strength. Compute $\vec B$ far from the center of a magnetic monopole and show that Dirac quantization holds.

V.7.6 Display explicitly the map $S^2 \to S^2$, which wraps one sphere around the other twice. Verify that this map corresponds to a magnetic monopole with magnetic charge 2.

V.7.7 Write down the variational equations that minimize (8).

V.7.8 Find the BPS solution explicitly.

V.7.9 Discuss the dyon solution. Work it out in the BPS limit.

V.7.10 Verify explicitly that the magnetic monopole is rotation invariant in spite of appearances. By this is meant that all physical gauge invariant quantities such as $\vec B$ are covariant under rotation. Gauge variant quantities such as $A^b_i$ can and do vary under rotation. Write down the generators of rotation.

V.7.11 Show that the mass of the magnetic monopole is about $137 M_W$.

V.7.12 Evaluate $n \equiv -(1/24\pi^2)\int_{S^3} \operatorname{tr}(g\,dg^\dagger)^3$ for the map $g = e^{i\vec\theta\cdot\vec\sigma}$. [Hint: By symmetry, you need calculate the integrand only in a small neighborhood of the identity element of the group or equivalently the north pole of $S^3$. Next, consider $g = \ldots$ for $m$ an integer and convince yourself that $m$ measures the number of times $S^3$ wraps around $S^3$.] Compare with exercise V.7.4 and admire the elegance of mathematics.

V.7.13 Prove that higher order corrections do not change the chiral anomaly $\partial_\mu J^\mu_5 = [1/(4\pi)^2]\varepsilon^{\mu\nu\lambda\sigma}\operatorname{tr} F_{\mu\nu}F_{\lambda\sigma}$ (I have rescaled $A \to (1/g)A$). [Hint: Integrate over spacetime and show that the left hand side is given by the number of right-handed fermion quanta minus the number of left-handed fermion quanta, so that both sides are given by integers.]
Fractional Statistics, Chern-Simons Term, and Topological Field Theory


Fractional statistics 

The existence of bosons and fermions represents one of the most profound features of quantum physics. When we interchange two identical quantum particles, the wave function acquires a factor of either $+1$ or $-1$. Leinaas and Myrheim, and later Wilczek independently, had the insight to recognize that in (2 + 1)-dimensional spacetime particles can also obey statistics other than Bose or Fermi statistics, a statistics now known as fractional or anyon statistics. These particles are now known as anyons.

To interchange two particles, we can move one of them half-way around the other and then translate both of them appropriately. When you take one anyon half-way around another anyon going anticlockwise, the wave function acquires a factor of $e^{i\theta}$ where $\theta$ is a real number characteristic of the particle. For $\theta = 0$, we have bosons and for $\theta = \pi$, fermions. Particles half-way between bosons and fermions, with $\theta = \pi/2$, are known as semions.

After Wilczek’s paper came out, a number of distinguished senior physicists were thoroughly confused. Thinking in terms of Schrödinger’s wave function, they got into endless arguments about whether the wave function must be single valued. Indeed, anyon statistics provides a striking example of the fact that the path integral formalism is sometimes significantly more transparent. The concept of anyon statistics can be formulated in terms of wave functions but it requires thinking clearly about the configuration space over which the wave function is defined.

Consider two indistinguishable particles at positions $x_1^i$ and $x_2^i$ at some initial time that end up at positions $x_1^f$ and $x_2^f$ a time $T$ later. In the path integral representation for $\langle x_1^f, x_2^f|\, e^{-iHT}\, |x_1^i, x_2^i\rangle$ we have to sum over all paths. In spacetime, the worldlines of the two particles braid around each other (see fig. VI.1.1). (We are implicitly assuming that the particles cannot go through each other, which is the case if there is a hard core repulsion between them.) Clearly, the paths can be divided into topologically distinct classes, characterized by an integer $n$ equal to the number of times the worldlines of the two particles braid around each other. Since the classes cannot be deformed into each other, the corresponding amplitudes cannot interfere quantum mechanically, and with the amplitudes in each class we are allowed to associate an additional phase factor $e^{i\alpha_n}$ beyond the usual factor coming from the action.

Figure VI.1.1

The dependence of $\alpha_n$ on $n$ is determined by how quantum amplitudes are to be combined. Suppose one particle goes around the other through an angle $\Delta\varphi_1$, a history to which we assign an additional phase factor $e^{if(\Delta\varphi_1)}$ with $f$ some as yet unknown function. Suppose this history is followed by another history in which our particle goes around the other by an additional angle $\Delta\varphi_2$. The phase factor $e^{if(\Delta\varphi_1 + \Delta\varphi_2)}$ we assign to the combined history clearly has to satisfy the composition law $e^{if(\Delta\varphi_1 + \Delta\varphi_2)} = e^{if(\Delta\varphi_1)}e^{if(\Delta\varphi_2)}$. In other words, $f(\Delta\varphi)$ has to be a linear function of its argument.

We conclude that in (2 + 1)-dimensional spacetime we can associate with the quantum amplitude corresponding to paths in which one particle goes around the other anticlockwise through an angle $\Delta\varphi$ a phase factor $e^{i(\theta/\pi)\Delta\varphi}$ with $\theta$ an arbitrary real parameter. Note that when one particle goes around the other clockwise through an angle $\Delta\varphi$ the quantum amplitude acquires a phase factor $e^{-i(\theta/\pi)\Delta\varphi}$.

When we interchange two anyons, we have to be careful to specify whether we do it “anticlockwise” or “clockwise,” producing factors $e^{i\theta}$ and $e^{-i\theta}$, respectively. This indicates immediately that parity $P$ and time reversal invariance $T$ are violated.
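The composition law and the distinction between the two senses of interchange can be spelled out in a few lines. This is only a sketch of the phase assignment $e^{i(\theta/\pi)\Delta\varphi}$ described above; the function names are mine.

```python
import cmath
import math

def braiding_phase(theta, delta_phi):
    """Phase for one particle going anticlockwise through an angle delta_phi
    around the other; a negative delta_phi means clockwise."""
    return cmath.exp(1j * (theta / math.pi) * delta_phi)

def exchange_phase(theta, anticlockwise=True):
    """An interchange moves one particle half-way (angle pi) around the other."""
    sign = 1.0 if anticlockwise else -1.0
    return braiding_phase(theta, sign * math.pi)

for name, theta in [("boson", 0.0), ("semion", math.pi / 2), ("fermion", math.pi)]:
    print(name, exchange_phase(theta, True), exchange_phase(theta, False))
```

For bosons and fermions the anticlockwise and clockwise factors coincide, so nothing distinguishes the two senses; for any other $\theta$ they differ, which is the violation of $P$ and $T$ noted in the text.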


Chern-Simons theory 

The next important question is whether all this can be incorporated in a local quantum field theory. The answer was given by Wilczek and Zee, who showed that the notion of fractional statistics can result from the effect of coupling to a gauge potential. The significance of a field theoretic formulation is that it demonstrates conclusively that the idea of anyon statistics is fully compatible with the cherished principles that we hold dear and that go into the construction of quantum field theory.

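Given a Lagrangian $\mathcal{L}_0$ with a conserved current $j^\mu$, construct the Lagrangian

$$\mathcal{L} = \mathcal{L}_0 + \gamma\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda + a_\mu j^\mu \tag{1}$$

Here $\varepsilon^{\mu\nu\lambda}$ denotes the totally antisymmetric symbol in (2 + 1)-dimensional spacetime and $\gamma$ is an arbitrary real parameter. Under a gauge transformation $a_\mu \to a_\mu + \partial_\mu\Lambda$, the term $\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$, known as the Chern-Simons term, changes by $\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda \to \varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda + \varepsilon^{\mu\nu\lambda}\partial_\mu\Lambda\,\partial_\nu a_\lambda$. The action changes by $\delta S = \gamma\int d^3x\,\varepsilon^{\mu\nu\lambda}\partial_\mu(\Lambda\partial_\nu a_\lambda)$ and thus, if we are allowed to drop boundary terms, as we assume to be the case here, the Chern-Simons action is gauge invariant. Note incidentally that in the language of differential forms you learned in chapters IV.5 and IV.6 the Chern-Simons term can be written compactly as $a\,da$.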

Let us solve the equation of motion derived from (1):

$$2\gamma\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda = -j^\mu \tag{2}$$

for a particle sitting at rest (so that $j_i = 0$). Integrating the $\mu = 0$ component of (2), we obtain

$$\int d^2x\,(\partial_1 a_2 - \partial_2 a_1) = -\frac{1}{2\gamma}\int d^2x\, j^0 \tag{3}$$

Thus, the Chern-Simons term has the effect of endowing the charged particles in the theory with flux. (Here the term charged particles simply means particles that couple to the gauge potential $a_\mu$. In this context, when we refer to charge and flux, we are obviously not referring to the charge and flux associated with the ordinary electromagnetic field. We are simply borrowing a useful terminology.)

By the Aharonov-Bohm effect (chapter IV.4), when one of our particles moves around another, the wave function acquires a phase, thus endowing the particles with anyon statistics with angle $\theta = 1/4\gamma$ (see exercise VI.1.5).

Strictly speaking, the term “fractional statistics” is somewhat misleading. First, a trivial remark: The statistics parameter $\theta$ does not have to be a fraction. Second, statistics is not directly related to counting how many particles we can put into a state. The statistics between anyons is perhaps better thought of as a long ranged phase interaction between them, mediated by the gauge potential $a$.

The appearance of $\varepsilon^{\mu\nu\lambda}$ in (1) signals the violation of parity $P$ and time reversal invariance $T$, something we already know.


Hopf term 

An alternative treatment is to integrate out $a$ in (1). As explained in chapter III.4, and as in any gauge theory, the inverse of the differential operator $\varepsilon\partial$ is not defined: It has a zero mode since $\varepsilon^{\mu\nu\lambda}\partial_\nu(\partial_\lambda F(x)) = 0$ for any smooth function $F(x)$. Let us choose the Lorenz gauge $\partial_\mu a^\mu = 0$. Then, using the fundamental identity of field theory (see appendix A) we obtain the nonlocal Lagrangian

$$\mathcal{L}_{\text{Hopf}} = \frac{1}{4\gamma}\left(j_\mu\frac{\varepsilon^{\mu\nu\lambda}\partial_\nu}{\partial^2}j_\lambda\right) \tag{4}$$

known as the Hopf term.

To determine the statistics parameter $\theta$ consider a history in which one particle moves half-way around another sitting at rest. The current $j$ is then equal to the sum of two terms describing the two particles. Plugging into (4) we evaluate the quantum phase $e^{iS} = e^{i\int d^3x\,\mathcal{L}_{\text{Hopf}}}$ and obtain $\theta = 1/(4\gamma)$.


Topological field theory 

There is something conceptually new about the pure Chern-Simons theory

$$S = \gamma\int_M d^3x\,\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda \tag{5}$$

It is topological.

Recall from chapter 1.11 that a field theory written in flat spacetime can be immediately 
promoted to a field theory in curved spacetime by replacing the Minkowski metric r]^ v by 
the Einstein metric g pv and including a factor ^/—g in the spacetime integration measure. 

But in the Chern-Simons theory ij^ v does not appear! Lorentz indices are contracted 
with the totally antisymmetric symbol e /JvA . Furthermore, we don’t need the factor y/—g, 
as I will now show. Recall also from chapter 1.11 that a vector field transforms as a^{x) — 
(dx' k /dx^a^ix') and so for three vector fields 

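$$\varepsilon^{\mu\nu\lambda}a_\mu(x)b_\nu(x)c_\lambda(x) = \varepsilon^{\mu\nu\lambda}\frac{\partial x'^\sigma}{\partial x^\mu}\frac{\partial x'^\tau}{\partial x^\nu}\frac{\partial x'^\rho}{\partial x^\lambda}\,a'_\sigma(x')b'_\tau(x')c'_\rho(x')$$

$$= \det\left(\frac{\partial x'}{\partial x}\right)\varepsilon^{\sigma\tau\rho}a'_\sigma(x')b'_\tau(x')c'_\rho(x')$$

On the other hand, $d^3x' = d^3x\,\det(\partial x'/\partial x)$. Observe, then, 

$$d^3x\,\varepsilon^{\mu\nu\lambda}a_\mu(x)b_\nu(x)c_\lambda(x) = d^3x'\,\varepsilon^{\sigma\tau\rho}a'_\sigma(x')b'_\tau(x')c'_\rho(x')$$

which is invariant without the benefit of $\sqrt{-g}$. 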
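
The determinant identity used above can be checked numerically. The following is only a sketch: it assumes a constant (linear) change of coordinates, so that the Jacobian is a single random 3x3 matrix, and the function names are hypothetical, chosen for illustration.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def eps_contract(a, b, c):
    """eps^{mu nu lam} a_mu b_nu c_lam, i.e. the determinant with rows a, b, c."""
    return det3([a, b, c])


def pull_back(jac, v_prime):
    """a_mu(x) = (dx'^lam/dx^mu) a'_lam(x'), with jac[lam][mu] = dx'^lam/dx^mu."""
    return [sum(jac[lam][mu] * v_prime[lam] for lam in range(3))
            for mu in range(3)]


def check_density(seed=0):
    """Return both sides of eps a b c = det(dx'/dx) eps a' b' c'."""
    rng = random.Random(seed)

    def rand_vec():
        return [rng.uniform(-1.0, 1.0) for _ in range(3)]

    jac = [rand_vec() for _ in range(3)]           # assumed constant Jacobian
    a_p, b_p, c_p = rand_vec(), rand_vec(), rand_vec()
    lhs = eps_contract(pull_back(jac, a_p),
                       pull_back(jac, b_p),
                       pull_back(jac, c_p))
    rhs = det3(jac) * eps_contract(a_p, b_p, c_p)
    return lhs, rhs
```

The two returned numbers agree to rounding error for any seed, which is the statement that the factor $\det(\partial x'/\partial x)$ from the fields is exactly the one supplied by the measure.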

So, the Chern-Simons action in (5) is invariant under general coordinate transformation; 
it is already written for curved spacetime. The metric $g_{\mu\nu}$ does not enter anywhere. 
The Chern-Simons theory does not know about clocks and rulers! It only knows about the 
topology of spacetime and is rightly known as a topological field theory. In other words, 
when the integral in (5) is evaluated over a closed manifold M the property of the field 
theory $\int Da\,e^{iS(a)}$ depends only on the topology of the manifold, and not on whatever 
metric we might put on the manifold. 


Ground state degeneracy 

Recall from chapter I.11 the fundamental definition of energy and momentum. The 
energy-momentum tensor is defined by the variation of the action with respect to $g^{\mu\nu}$, but 
hey, the action here does not depend on $g_{\mu\nu}$. The energy-momentum tensor and hence the 
Hamiltonian is identically zero! One way of saying this is that to define the Hamiltonian 
we need clocks and rulers. 

What does it mean for a quantum system to have a Hamiltonian H = 0? Well, when we 
took a course on quantum mechanics, if the professor assigned an exam problem to find 
the spectrum of the Hamiltonian 0, we could do it easily! All states have energy E = 0. We 
are ready to hand it in. 

But the nontrivial problem is to find how many states there are. This number is known 
as the ground state degeneracy and depends only on the topology of the manifold M. 


Massive Dirac fermions and the Chern-Simons term 


Consider a gauge potential $a_\mu$ coupled to a massive Dirac fermion in (2 + 1)-dimensional 
spacetime: $\mathcal{L} = \bar\psi(i\not\partial + \not a - m)\psi$. You did an exercise way back in chapter II.1 discovering 
the rather surprising phenomenon that in (2 + 1)-dimensional spacetime the Dirac mass 
term violates P and T. (What? You didn't do it? You have to go back.) Thus, we would expect 
to generate the P and T violating Chern-Simons term $\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$ if we integrate out the 
fermion to get the term $\mathrm{tr}\log(i\not\partial + \not a - m)$ in the effective action, along the lines discussed 
in chapter IV.3. 

In one-loop order we have the vacuum polarization diagram (diagrammatically exactly 
the same as in chapter III.7 but in a spacetime with one less dimension) with a Feynman 
integral proportional to 

$$\int\frac{d^3p}{(2\pi)^3}\,\mathrm{tr}\left(\gamma^\nu\frac{1}{\not p + \not q - m}\gamma^\mu\frac{1}{\not p - m}\right)$$  (6)

As we will see, the change 4 → 3 makes all the difference in the world. I leave it to you 
to evaluate (6) in detail (exercise VI.1.7) but let me point out the salient features here. 
Since the $\partial_\lambda$ in the Chern-Simons term corresponds to $q_\lambda$ in momentum space, in order 
to identify the coefficient of the Chern-Simons term we need only differentiate (6) with 
respect to $q_\lambda$ and set q → 0: 

$$\int\frac{d^3p}{(2\pi)^3}\,\mathrm{tr}\left(\gamma^\nu\frac{1}{\not p - m}\gamma^\lambda\frac{1}{\not p - m}\gamma^\mu\frac{1}{\not p - m}\right) = \int\frac{d^3p}{(2\pi)^3}\,\frac{\mathrm{tr}[\gamma^\nu(\not p + m)\gamma^\lambda(\not p + m)\gamma^\mu(\not p + m)]}{(p^2 - m^2)^3}$$  (7)

I will simply focus on one piece of the integral, the piece coming from the term in the 
trace proportional to $m^3$: 

$$\varepsilon^{\mu\nu\lambda}\,m^3\int\frac{d^3p}{(p^2 - m^2)^3}$$  (8)


As I remarked in exercise II.1.12, in (2 + 1)-dimensional spacetime the $\gamma^\mu$'s are just the 
three Pauli matrices and thus $\mathrm{tr}(\gamma^\nu\gamma^\lambda\gamma^\mu)$ is proportional to $\varepsilon^{\mu\nu\lambda}$: The antisymmetric 
symbol appears as we expect from P and T violation. 

By dimensional analysis, we see that the integral in (8) is up to a numerical constant 
equal to $1/m^3$ and so m cancels. 

But be careful! The integral depends only on $m^2$ and doesn't know about the sign of m. 
The correct answer is proportional to $1/|m|^3$, not $1/m^3$. Thus, the coefficient of the 
Chern-Simons term is equal to $m^3/|m|^3 = m/|m|$ = sign of m, up to a numerical constant. An 
instructive example of an important sign! This makes sense since under P (or T) a Dirac 
field with mass m is transformed into a Dirac field with mass −m. In a parity-invariant 
theory, with a doublet of Dirac fields with masses m and −m, a Chern-Simons term should 
not be generated. 
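
Two ingredients of this argument are easy to verify by hand or by machine. The sketch below assumes the plain Pauli matrices, for which $\mathrm{tr}(\sigma_a\sigma_b\sigma_c) = 2i\,\varepsilon_{abc}$; the Minkowski $\gamma^\mu$'s differ from these by factors of i, which changes the constant but not the antisymmetric structure. The helper names are hypothetical.

```python
# Pauli matrices sigma_1, sigma_2, sigma_3 as nested lists (indices 0, 1, 2).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace3(a, b, c):
    """tr(sigma_a sigma_b sigma_c) for indices a, b, c in {0, 1, 2}."""
    m = matmul(matmul(SIGMA[a], SIGMA[b]), SIGMA[c])
    return m[0][0] + m[1][1]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2


def cs_sign(m):
    """The combination m^3/|m|^3 that multiplies the Chern-Simons term."""
    return m ** 3 / abs(m) ** 3


def doublet_coefficient(m):
    """Sum over a parity doublet with masses m and -m."""
    return cs_sign(m) + cs_sign(-m)
```

The trace reproduces the antisymmetric symbol for all 27 index choices, `cs_sign` depends only on the sign of m, and the doublet sum vanishes, in line with the remark about a parity-invariant theory.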


Exercises 

VI.1.1 In a nonrelativistic theory you might think that there are two separate Chern-Simons terms, 
$\varepsilon_{ij}a_i\partial_0 a_j$ and $\varepsilon_{ij}a_0\partial_i a_j$. Show that gauge invariance forces the two terms to combine into a single 
Chern-Simons term $\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$. For the Chern-Simons term, gauge invariance implies Lorentz 
invariance. In contrast, the Maxwell term would in general be nonrelativistic, consisting of two terms, 
$f_{0i}^2$ and $f_{ij}^2$, with an arbitrary relative coefficient between them (with $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$ as usual). 

VI.1.2 By thinking about mass dimensions, convince yourself that the Chern-Simons term dominates the 
Maxwell term at long distances. This is one reason that relativistic field theorists find anyon fluids so 
appealing. As long as they are interested only in long distance physics they can ignore the Maxwell 
term and play with a relativistic theory (see exercise VI.1.1). Note that this picks out (2 + 1)-dimensional 
spacetime as special. In (3 + 1)-dimensional spacetime the generalization of the Chern-Simons term 
$\varepsilon^{\mu\nu\lambda\sigma}f_{\mu\nu}f_{\lambda\sigma}$ has the same mass dimension as the Maxwell term $f^2$. In (4 + 1)-dimensional space the 
term $\varepsilon^{\rho\mu\nu\lambda\sigma}a_\rho f_{\mu\nu}f_{\lambda\sigma}$ is less important at long distances than the Maxwell term $f^2$. 

VI.1.3 There is a generalization of the Chern-Simons term to higher dimensional spacetime different from 
that given in exercise VI.1.2. We can introduce a p-form gauge potential (see chapter IV.4). Write the 
generalized Chern-Simons term in (2p + 1)-dimensional spacetime and discuss the resulting theory. 

VI.1.4 Consider $\mathcal{L} = \gamma a\varepsilon\partial a - (1/4g^2)f^2$. Calculate the propagator and show that the gauge boson is massive. 
Some physicists puzzled by fractional statistics have reasoned that since in the presence of the Maxwell 
term the gauge boson is massive and hence short ranged, it can't possibly generate fractional statistics, 
which is manifestly an infinite ranged interaction. (No matter how far apart the two particles we are 
interchanging are, the wave function still acquires a phase.) The resolution is that the information is 
in fact propagated over an infinite range by a q = 0 pole associated with a gauge degree of freedom. 
This apparent paradox is intimately connected with the puzzlement many physicists felt when they first 
heard of the Aharonov-Bohm effect. How can a particle in a region with no magnetic field whatsoever 
and arbitrarily far from the magnetic flux know about the existence of the magnetic flux? 

VI.1.5 Show that $\theta = 1/4\gamma$. There is a somewhat tricky factor¹ of 2. So if you are off by a factor of 2, don't despair. 
Try again. 

VI.1.6 Find the nonabelian version of the Chern-Simons term $a\,da$. [Hint: As in chapter IV.6 it might be easier 
to use differential forms.] 

VI.1.7 Using the canonical formalism of chapter I.8 show that the Chern-Simons Lagrangian leads to the 
Hamiltonian H = 0. 

VI.1.8 Evaluate (6). 


1 X. G. Wen and A. Zee, J. de Physique 50: 1623, 1989. 



VI.2  Quantum Hall Fluids 




Interplay between two pieces of physics 

Over the last decade or so, the study of topological quantum fluids (of which the Hall fluid 
is an example) has emerged as an interesting subject. The quantum Hall system consists 
of a bunch of electrons moving in a plane in the presence of an external magnetic field 
B perpendicular to the plane. The magnetic field is assumed to be sufficiently strong so 
that the electrons all have spin up, say, so they may be treated as spinless fermions. As is 
well known, this seemingly innocuous and simple physical situation contains a wealth of 
physics, the elucidation of which has led to two Nobel prizes. This remarkable richness 
follows from the interplay between two basic pieces of physics. 

1. Even though the electron is pointlike, it takes up a finite amount of room. 

Classically, a charged particle in a magnetic field moves in a Larmor circle of radius 
r determined by $evB = mv^2/r$. Classically, the radius is not fixed, with more energetic 
particles moving in larger circles, but if we quantize the angular momentum mvr to be 
$h = 2\pi$ (in units in which $\hbar$ is equal to unity) we obtain $eBr^2 \sim 2\pi$. A quantum electron 
takes up an area of order $\pi r^2 \sim 2\pi^2/eB$. 

2. Electrons are fermions and want to stay out of each other’s way. 

Not only does each electron insist on taking up a finite amount of room, each has to 
have its own room. Thus, the quantum Hall problem may be described as a sort of housing 
crisis, or as the problem of assigning office space at an Institute for Theoretical Physics to 
visitors who do not want to share offices. 

Already at this stage, we would expect that when the number of electrons $N_e$ is just 
right to fill out space completely, namely when $N_e\pi r^2 \sim N_e(2\pi^2/eB) \sim A$, the area of the 
system, something special happens. 


Landau levels and the integer Hall effect 

These heuristic considerations could be made precise, of course. The textbook problem of 
a single spinless electron in a magnetic field 

$$-[(\partial_x - ieA_x)^2 + (\partial_y - ieA_y)^2]\psi = 2mE\psi$$

was solved by Landau decades ago. The states occur in degenerate sets with energy 
$E_n = (n + \frac{1}{2})\frac{eB}{m}$, n = 0, 1, 2, ···, known as the nth Landau level. Each Landau level has 
degeneracy $BA/2\pi$, where A is the area of the system, reflecting the fact that the Larmor 
circles may be placed anywhere. Note that one Landau level is separated from the next by 
a finite amount of energy (eB/m). 

Imagine putting in noninteracting electrons one by one. By the Pauli exclusion principle, 
each succeeding electron we put in has to go into a different state in the Landau level. Since 
each Landau level can hold $BA/2\pi$ electrons it is natural (see exercise VI.2.1) to define a 
filling factor $\nu = N_e/(BA/2\pi)$. When $\nu$ is equal to an integer, the first $\nu$ Landau levels are 
filled. If we want to put in one more electron, it would have to go into the $(\nu + 1)$st Landau 
level, costing us more energy than what we spent for the preceding electron. 

Thus, for $\nu$ equal to an integer the Hall fluid is incompressible. Any attempt to compress 
it lessens the degeneracy of the Landau levels (the effective area A decreases and so the 
degeneracy $BA/2\pi$ decreases) and forces some of the electrons to the next level, costing 
us lots of energy. 
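
The counting in the last few paragraphs amounts to a handful of one-line formulas. The sketch below simply transcribes them, in the units of the text ($\hbar = 1$, with the charge absorbed into B in the degeneracy); the function names and the sample numbers are illustrative assumptions, not values from the text.

```python
import math


def landau_degeneracy(b_field, area):
    """Number of states per Landau level, B A / (2 pi)."""
    return b_field * area / (2.0 * math.pi)


def filling_factor(n_electrons, b_field, area):
    """nu = N_e / (B A / 2 pi)."""
    return n_electrons / landau_degeneracy(b_field, area)


def landau_energy(n, b_field, mass, charge=1.0):
    """E_n = (n + 1/2) e B / m for n = 0, 1, 2, ..."""
    return (n + 0.5) * charge * b_field / mass


def level_gap(b_field, mass, charge=1.0):
    """Energy needed to promote an electron to the next Landau level."""
    return landau_energy(1, b_field, mass, charge) - landau_energy(0, b_field, mass, charge)
```

For instance, with $B = 2\pi$ and A = 10 each level holds 10 states, so 20 electrons give $\nu = 2$; shrinking A at fixed $N_e$ pushes $\nu$ above 2, which is the statement that compression forces electrons across the gap eB/m.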

An electric field $E_y$ imposed on the Hall fluid in the y direction produces a current 
$J_x = \sigma_{xy}E_y$ in the x direction with $\sigma_{xy} = \nu$ (in units of $e^2/h$). This is easily understood in 
terms of the Lorentz force law obeyed by electrons in the presence of a magnetic field. The 
surprising experimental discovery was that the Hall conductance $\sigma_{xy}$ when plotted against 
B goes through a series of plateaus, which you might have heard about. To understand 
these plateaus we would have to discuss the effect of impurities. I will touch upon the 
fascinating subject of impurities and disorder in chapter VI.8. 

So, the integer quantum Hall effect is relatively easy to understand. 


Fractional Hall effect 

After the integer Hall effect, the experimental discovery of the fractional Hall effect, 
namely that the Hall fluid is also incompressible for filling factor $\nu$ equal to simple 
odd-denominator fractions such as 1/3 and 1/5, took theorists completely by surprise. For $\nu = 1/3$, 
only one-third of the states in the first Landau level are filled. It would seem that throwing 
in a few more electrons would not have that much effect on the system. Why should the 
$\nu = 1/3$ Hall fluid be incompressible? 

Interaction between electrons turns out to be crucial. The point is that saying the first 
Landau level is one-third filled with noninteracting spinless electrons does not define a 
unique many-body state: there is an enormous degeneracy since each of the electrons can 
go into any of the $BA/2\pi$ states available subject only to Pauli exclusion. But as soon as we 
turn on a repulsive interaction between the electrons, a presumably unique ground state 
is picked out within the vast space of degenerate states. Wen has described the fractional 
Hall state as an intricate dance of electrons: Not only does each electron occupy a finite 
amount of room on the dance floor, but due to the mutual repulsion, it has to be careful 
not to bump into another electron. The dance has to be carefully choreographed, possible 
only for certain special values of v. 

Impurities also play an essential role, but we will postpone the discussion of impurities 
to chapter VI.6. 

In trying to understand the fractional Hall effect, we have an important clue. You will 
remember from chapter V.7 that the fundamental unit of flux is given by $2\pi$, and thus the 
number of flux quanta penetrating the plane is equal to $N_\phi = BA/2\pi$. Thus, the puzzle is 
that something special happens when the number of flux quanta per electron $N_\phi/N_e = \nu^{-1}$ 
is an odd integer. 

I arranged the chapters so that what you learned in the previous chapter is relevant to 
solving the puzzle. Suppose that $\nu^{-1}$ flux quanta are somehow bound to each electron. 
When we interchange two such bound systems there is an additional Aharonov-Bohm 
phase in addition to the (−1) from the Fermi statistics of the electrons. For $\nu^{-1}$ odd these 
bound systems effectively obey Bose statistics and can be described by a complex scalar 
field $\varphi$. The condensation of $\varphi$ turns out to be responsible for the physics of the quantum 
Hall fluid. 


Effective field theory of the Hall fluid 

We would like to derive an effective field theory of the quantum Hall fluid, first obtained 
by Kivelson, Hansson, and Zhang. There are two alternative derivations, a long way and a 
short way. 

In the long way, we start with the Lagrangian describing spinless electrons in a magnetic 
field in the second quantized formalism (we will absorb the electric charge e into $A_\mu$), 

$$\mathcal{L} = \psi^\dagger i(\partial_0 - iA_0)\psi + \frac{1}{2m}\psi^\dagger(\partial_i - iA_i)^2\psi + V(\psi^\dagger\psi)$$  (1)

and massage it into the form we want. In the previous chapter, we learned that by introducing 
a Chern-Simons gauge field we can transform $\psi$ into a scalar field. We then invoke 
duality, which we will learn about in the next chapter, to represent the phase degree of 
freedom of the scalar field as a gauge field. After a number of steps, we will discover that 
the effective theory of the Hall fluid turns out to be a Chern-Simons theory. 

Instead, I will follow the short way. We will argue by the “what else can it be” method 
or, to put it more elegantly, by invoking general principles. 

Let us start by listing what we know about the Hall system. 

1. We live in (2 + 1)-dimensional spacetime (because the electrons are restricted to a plane.) 

2. The electromagnetic current $J^\mu$ is conserved: $\partial_\mu J^\mu = 0$. 

These two statements are certainly indisputable; when combined they tell us that the 
current can be written as the curl of a vector potential 

$$J^\mu = \frac{1}{2\pi}\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda$$  (2)

The factor of $1/(2\pi)$ defines the normalization of $a_\mu$. We learned in school that in 
3-dimensional spacetime, if the divergence of something is zero, then that something is 
the curl of something else. That is precisely what (2) says. The only sophistication here is 
that what we learned in school works in Minkowskian space as well as Euclidean space; it 
is just a matter of a few signs here and there. 


The gauge potential comes looking for us 

Observe that when we transform $a_\mu$ by $a_\mu \to a_\mu - \partial_\mu\Lambda$, the current is unchanged. In other 
words, $a_\mu$ is a gauge potential. 

We did not go looking for a gauge potential; the gauge potential came looking for us! 
There is no place to hide. The existence of a gauge potential follows from completely 
general considerations. 

3. We want to describe the system field theoretically by an effective local Lagrangian. 

4. We are only interested in the physics at long distance and large time, that is, at small 
wave number and low frequency. 

Indeed, a field theoretic description of a physical system may be regarded as a means of 
organizing various aspects of the relevant physics in a systematic way according to their 
relative importance at long distances and according to symmetries. We classify terms in a 
field theoretic Lagrangian according to powers of derivatives, powers of the fields, and so 
forth. A general scheme for classifying terms is according to their mass dimensions, as 
explained in chapter III.2. The gauge potential $a_\mu$ has dimension 1, as is always the case for 
any gauge potential coupled to matter fields according to the gauge principle, and thus (2) 
is consistent with the fact that the current has mass dimension 2 in (2 + 1)-dimensional 
spacetime. 

5. Parity and time reversal are broken by the external magnetic field. 

This last statement is just as indisputable as statements 1 and 2. The experimentalist 
produces the magnetic field by driving a current through a coil with the current flowing 
either clockwise or anticlockwise. 

Given these five general statements we can deduce the form of the effective Lagrangian. 

Since gauge invariance forbids the dimension-2 term $a_\mu a^\mu$ in the Lagrangian, the 
simplest possible term is in fact the dimension-3 Chern-Simons term $\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda$. Thus, 
the Lagrangian is simply 

$$\mathcal{L} = \frac{k}{4\pi}a\varepsilon\partial a + \cdots$$  (3)

where k is a dimensionless parameter to be determined. 

We have introduced and will use henceforth the compact notation 
$\varepsilon a\partial b \equiv \varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu b_\lambda = \varepsilon b\partial a$ for two vector fields $a_\mu$ and $b_\mu$. 

The terms indicated by (···) in (3) include the dimension-4 Maxwell term $(1/g^2)(f_{0i}^2 - \beta f_{ij}^2)$ 
and other terms with higher dimensions. (Here $\beta$ is some constant; see exercise VI.1.1.) 
The important observation is that these higher dimensional terms are less 
important at long distances. The long distance physics is determined purely by the 
Chern-Simons term. In general the coefficient k may well be zero, in which case the physics is 
determined by the short distance terms represented by the (···) in (3). Put differently, a 
Hall fluid may be defined as a 2-dimensional electron system for which the coefficient of 
the Chern-Simons term does not vanish, and consequently is such that its long distance 
physics is largely independent of the microscopic details that define the system. Indeed, 
we can classify 2-dimensional electron systems according to whether k is zero or not. 

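Coupling the system to an “external” or “additional” electromagnetic gauge potential $A_\mu$ 
and using (2) we obtain (after integrating by parts and dropping a surface term) 

$$\mathcal{L} = \frac{k}{4\pi}\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda - \frac{1}{2\pi}\varepsilon^{\mu\nu\lambda}A_\mu\partial_\nu a_\lambda = \frac{k}{4\pi}\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu a_\lambda - \frac{1}{2\pi}\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu A_\lambda$$  (4)

Note that the gauge potential of the magnetic field responsible for the Hall effect should 
not be included in $A_\mu$; it is implicitly contained already in the coefficient k. 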

The notion of quasiparticles or “elementary” excitations is basic to condensed matter 
physics. The effects of a many-body interaction may be such that the quasiparticles in the 
system are no longer electrons. Here we define the quasiparticles as the entities that couple 
to the gauge potential and thus write 

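$$\mathcal{L} = \frac{k}{4\pi}a\varepsilon\partial a + a_\mu j^\mu - \frac{1}{2\pi}\varepsilon^{\mu\nu\lambda}a_\mu\partial_\nu A_\lambda + \cdots$$  (5)

Defining $\tilde{j}^\mu \equiv j^\mu - (1/2\pi)\varepsilon^{\mu\nu\lambda}\partial_\nu A_\lambda$ and integrating out the gauge field we obtain (see 
VI.1.4) 

$$\mathcal{L} = \frac{\pi}{k}\,\tilde{j}_\mu\left(\frac{\varepsilon^{\mu\nu\lambda}\partial_\nu}{\partial^2}\right)\tilde{j}_\lambda$$  (6)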


Fractional charge and statistics 

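We can now simply read off the physics from (6). The Lagrangian contains three types 
of terms: AA, Aj, and jj. The AA term has the schematic form $A(\varepsilon\partial\,\varepsilon\partial\,\varepsilon\partial/\partial^2)A$. Using 
$\varepsilon\partial\,\varepsilon\partial \sim \partial^2$ and canceling between numerator and denominator, we obtain 

$$\mathcal{L} = \frac{1}{4\pi k}A\varepsilon\partial A$$  (7)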

Varying with respect to A we determine the electromagnetic current 

$$J^\mu_{\rm em} = \frac{1}{2\pi k}\varepsilon^{\mu\nu\lambda}\partial_\nu A_\lambda$$  (8)

We learn from the $\mu = 0$ component of this equation that an excess density $\delta n$ of electrons 
is related to a local fluctuation of the magnetic field by $\delta n = (1/2\pi k)\,\delta B$; thus we can identify 
the filling factor $\nu$ as 1/k, and from the $\mu = i$ components that an electric field produces 
a current in the orthogonal direction with $\sigma_{xy} = (1/k) = \nu$. 

The Aj term has the schematic form $A(\varepsilon\partial\,\varepsilon\partial/\partial^2)j$. Canceling the differential operators, 
we find 

$$\mathcal{L} = \frac{1}{k}A_\mu j^\mu$$  (9)

Thus, the quasiparticle carries electric charge 1/k. 

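Finally, the quasiparticles interact with each other via 

$$\mathcal{L} = \frac{\pi}{k}\,j_\mu\,\frac{\varepsilon^{\mu\nu\lambda}\partial_\nu}{\partial^2}\,j_\lambda$$  (10)

We simply remove the twiddle sign in (6). Recalling chapter VI.1 we see that quasiparticles 
obey fractional statistics with 

$$\frac{\theta}{\pi} = \frac{1}{k}$$  (11)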


By now, you may well be wondering: all this is fine and good, but what would 
actually tell us that $\nu^{-1}$ has to be an odd integer? 

We now argue that the electron or hole should appear somewhere in the excitation 
spectrum. After all, the theory is supposed to describe a system of electrons and thus 
far our rather general Lagrangian does not contain any reference to the electron! 

Let us look for the hole (or electron). We note from (9) that a bound object made up of 
k quasiparticles would have charge equal to 1. This is perhaps the hole! For this to work, 
we see that k has to be an integer. So far so good, but k doesn’t have to be odd yet. 

What is the statistics of this bound object? Let us move one of these bound objects 
half-way around another such bound object, thus effectively interchanging them. When one 
quasiparticle moves around another we pick up a phase given by $\theta/\pi = 1/k$ according to 
(11). But here we have k quasiparticles going around k quasiparticles and so we pick up a 
phase 

$$\frac{\theta}{\pi} = \frac{1}{k}\,k^2 = k$$  (12)

For the hole to be a fermion we must require $\theta/\pi$ to be an odd integer. This fixes k to be 
an odd integer. 
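
The bookkeeping of the last few paragraphs can be collected into a few lines of exact arithmetic. This is only a sketch of the counting in (9), (11), and (12), using the identification $\nu = 1/k$ made earlier; the function name and the dictionary keys are hypothetical.

```python
from fractions import Fraction


def laughlin_fluid(k):
    """Quantities read off from (9), (11), and (12) for a positive integer k."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    quasiparticle_charge = Fraction(1, k)          # from (9)
    theta_over_pi = Fraction(1, k)                 # from (11)
    bound_charge = k * quasiparticle_charge        # k quasiparticles bound together
    bound_theta_over_pi = theta_over_pi * k * k    # from (12)
    return {
        "filling": Fraction(1, k),
        "charge": quasiparticle_charge,
        "theta_over_pi": theta_over_pi,
        "bound_charge": bound_charge,
        "bound_theta_over_pi": bound_theta_over_pi,
        # The bound object is a fermion when theta/pi is an odd integer.
        "hole_is_fermion": bound_theta_over_pi % 2 == 1,
    }
```

Calling `laughlin_fluid(3)` gives charge 1/3 and statistics 1/3 for the quasiparticle and a charge-1 fermion for the bound object, while `laughlin_fluid(2)` fails the fermion test, which is the reason k must be odd.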

Since $\nu = 1/k$, we have here the classic Laughlin odd-denominator Hall fluids with filling 
factor $\nu = 1/3, 1/5, 1/7, \cdots$. The famous result that the quasiparticles carry fractional charge and 
statistics just pops out [see (9) and (11)]. 

This is truly dramatic: a bunch of electrons moving around in a plane with a magnetic 
field corresponding to $\nu = 1/3$, and lo and behold, each electron has fragmented into three 
pieces, each piece with charge 1/3 and fractional statistics 1/3! 


A new kind of order 

The goal of condensed matter physics is to understand the various states of matter. States 
of matter are characterized by the presence (or absence) of order: a ferromagnet becomes 
ordered below the transition temperature. In the Landau-Ginzburg theory, as we saw in 
chapter V.3, order is associated with spontaneous symmetry breaking, described naturally 
with group theory. Girvin and MacDonald first noted that the order in Hall fluids does not 
really fit into the Landau-Ginzburg scheme: We have not broken any obvious symmetry. 
The topological property of the Hall fluids provides a clue to what is going on. As explained 
in the preceding chapter, the ground state degeneracy of a Hall fluid depends on the 
topology of the manifold it lives on, a dependence group theory is incapable of accounting 
for. Wen has forcefully emphasized that the study of topological order, or more generally 
quantum order, may open up a vast new vista on the possible states of matter. 1 


Comments and generalization 

Let me conclude with several comments that might spur you to explore the wealth of 
literature on the Hall fluid. 

1. The appearance of integers implies that our result is robust. A slick argument can 
be made based on the remark in the previous chapter that the Chern-Simons term does 
not know about clocks and rulers and hence can’t possibly depend on microphysics such 
as the scattering of electrons off impurities which cannot be defined without clocks and 
rulers. In contrast, the physics that is not part of the topological field theory and described 
by (···) in (3) would certainly depend on detailed microphysics. 

2. If we had followed the long way to derive the effective field theory of the Hall fluid, we 
would have seen that the quasiparticle is actually a vortex constructed (as in chapter V.7) 
out of the scalar field representing the electron. Given that the Hall fluid is incompressible, 
just about the only excitation you can think of is a vortex with electrons coherently whirling 
around. 

3. In the previous chapter we remarked that the Chern-Simons term is gauge invariant 
only upon dropping a boundary term. But real Hall fluids in the laboratory live in samples 
with boundaries. So how can (3) be correct? Remarkably, this apparent “defect” of the theory 
actually represents one of its virtues! Suppose the theory (3) is defined on a bounded 2- 
dimensional manifold, a disk for example. Then as first argued by Wen there must be 
physical degrees of freedom living on the boundary and represented by an action whose 
change under a gauge transformation cancels the change of $\int d^3x\,(k/4\pi)\,a\varepsilon\partial a$. Physically, 
it is clear that an incompressible fluid would have edge excitations 2 corresponding to waves 
on its boundary. 

1 X. G. Wen, Quantum Field Theory of Many-Body Systems. 

2 The existence of edge currents in the integer Hall fluid was first pointed out by Halperin. 

4. What if we refuse to introduce gauge potentials? Since the current $J_\mu$ has dimension 
2, the simplest term constructed out of the currents, $J_\mu J^\mu$, is already of dimension 4; 
indeed, this is just the Maxwell term. There is no way of constructing a dimension 3 local 
interaction out of the currents directly. To lower the dimension we are forced to introduce 
the inverse of the derivative and write schematically $J(1/\varepsilon\partial)J$, which is of course just the 
non-local Hopf term. Thus, the question “why gauge field?” that people often ask can be 
answered in part by saying that the introduction of gauge fields allows us to avoid dealing 
with nonlocal interactions. 

5. Experimentalists have constructed double-layered quantum Hall systems with an 
infinitesimally small tunneling amplitude for electrons to go from one layer to the other. 
Assuming that the current $J_I^\mu$ (I = 1, 2) in each layer is separately conserved, we introduce 
two gauge potentials by writing $J_I^\mu = \frac{1}{2\pi}\varepsilon^{\mu\nu\lambda}\partial_\nu a_{I\lambda}$ as in (2). We can repeat our general 
argument and arrive at the effective Lagrangian 

$$\mathcal{L} = \sum_{I,J}\frac{K_{IJ}}{4\pi}\,a_I\varepsilon\partial a_J + \cdots$$  (13)

The integer k has been promoted to a matrix K. As an exercise, you can derive the Hall 
conductance, the fractional charge, and the statistics of the quasiparticles. You would not 
be surprised that everywhere 1/k appears we now have the matrix inverse $K^{-1}$ instead. 
An interesting question is what happens when K has a zero eigenvalue. For example, we 
could have $K = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$. Then the low energy dynamics of the gauge potential $a_- \equiv a_1 - a_2$ 
is not governed by the Chern-Simons term, but by the Maxwell term in the (···) in (13). 
We have a linearly dispersing mode and thus a superfluid! This striking prediction³ was 
verified experimentally. 
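
A small sketch of the matrix version, in exact arithmetic. It rests on one assumption not spelled out in the text: that both layers couple to the external potential with unit charge, so that the filling factor is the sum of all entries of $K^{-1}$ (the natural reading of "1/k becomes $K^{-1}$"). The function names and the sample matrix with entries 3 and 2 are illustrative, not taken from the text.

```python
from fractions import Fraction


def det2(k_matrix):
    """Determinant of a 2x2 integer matrix."""
    return k_matrix[0][0] * k_matrix[1][1] - k_matrix[0][1] * k_matrix[1][0]


def inverse2(k_matrix):
    """Exact inverse of a 2x2 integer matrix, or None if it is singular."""
    d = det2(k_matrix)
    if d == 0:
        return None
    return [[Fraction(k_matrix[1][1], d), Fraction(-k_matrix[0][1], d)],
            [Fraction(-k_matrix[1][0], d), Fraction(k_matrix[0][0], d)]]


def filling_from_k(k_matrix, charges=(1, 1)):
    """nu = t^T K^{-1} t with t the (assumed) charge vector; None if K is singular."""
    inv = inverse2(k_matrix)
    if inv is None:
        return None
    return sum(charges[i] * inv[i][j] * charges[j]
               for i in range(2) for j in range(2))


def apply2(k_matrix, vec):
    """Matrix times vector, used to exhibit a zero mode of K."""
    return [sum(k_matrix[i][j] * vec[j] for j in range(2)) for i in range(2)]
```

With diagonal entries 3 and off-diagonal entries 2 this gives $\nu = 2/5$, and two decoupled $k = 3$ layers give 2/3. For the matrix with all entries equal to 1 the inverse does not exist, and `apply2` shows that the combination (1, −1), that is $a_-$, is the zero mode left to the Maxwell term.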

6. Finally, an amusing remark: In this formalism electron tunneling corresponds to the 
nonconservation of the current $J_-^\mu \equiv J_1^\mu - J_2^\mu = (1/2\pi)\varepsilon^{\mu\nu\lambda}\partial_\nu a_{-\lambda}$. The difference $N_1 - N_2$ 
of the number of electrons in the two layers is not conserved. But how can $\partial_\mu J_-^\mu \neq 0$ even 
though $J_-^\mu$ is the curl of $a_{-\lambda}$ (as I have indicated explicitly)? Recalling chapter IV.4, you the 
astute reader say, aha, magnetic monopoles! Tunneling in a double-layered Hall system in 
Euclidean spacetime can be described as a gas of monopoles and antimonopoles.⁴ (Think, 
why monopoles and antimonopoles?) Note of course that these are not monopoles in the 
usual electromagnetic gauge potential but in the gauge potential $a_{-\lambda}$. 

What we have given in this section is certainly a very slick derivation of the effective 
long distance theory of the Hall fluid. Some would say too slick. Let us go back to our 
five general statements or principles. Of these five, four are absolutely indisputable. In 
fact, the most questionable is the statement that looks the most innocuous to the casual 
reader, namely statement 3. In general, the effective Lagrangian for a condensed matter 
system would be nonlocal. We are implicitly assuming that the system does not contain a 
massless field, the exchange of which would lead to a nonlocal interaction. 5 Also implicit 

3 X. G. Wen and A. Zee, Phys. Rev. Lett. 69: 1811, 1992. 

4 X. G. Wen and A. Zee, Phys. Rev. B47: 2265, 1993. 

5 A technical remark: Vortices (i.e., quasiparticles) pinned to impurities in the Hall fluid can generate an 
interaction nonlocal in time. 

in (3) is the assumption that the Lagrangian can be expressed completely in terms of the 
gauge potential a. A priori, we certainly do not know that there might not be other relevant 
degrees of freedom. The point is that as long as these degrees of freedom are not gapless 
they can be safely integrated out. 


Exercises 

VI.2.1 To define filling factor precisely, we have to discuss the quantum Hall system on a sphere rather than on 
a plane. Put a magnetic monopole of strength G (which according to Dirac can be only a half-integer or 
an integer) at the center of a unit sphere. The flux through the sphere is equal to $N_\phi = 2G$. Show that the 
single electron energy is given by $E_l = (\frac{1}{2}\hbar\omega_c)[l(l + 1) - G^2]/G$ with the Landau levels corresponding 
to l = G, G + 1, G + 2, ..., and that the degeneracy of the lth level is 2l + 1. With L Landau levels filled 
with noninteracting electrons ($\nu = L$) show that $N_\phi = \nu^{-1}N_e - S$, where the topological quantity S is 
known as the shift. 

VI.2.2 For a challenge, derive the effective field theory for Hall fluids with filling factor $\nu = m/k$ with k an 
odd integer, such as $\nu = 2/5$. [Hint: You have to introduce m gauge potentials $a_{I\lambda}$ and generalize (2) to 
$J^\mu = (1/2\pi)\varepsilon^{\mu\nu\lambda}\partial_\nu\sum_{I=1}^m a_{I\lambda}$. The effective theory turns out to be 

$$\mathcal{L} = \frac{1}{4\pi}\sum_{I,J=1}^m a_I K_{IJ}\,\varepsilon\partial a_J + \sum_{I=1}^m a_{I\mu}\tilde{j}_I^\mu + \cdots$$

with the integer k replaced by a matrix K. Compare with (13).] 

VI.2.3 For the Lagrangian in (13), derive the analogs of (8), (9), and (11). 



VI.3  Duality 




A far reaching concept 

Duality is a profound and far reaching concept 1 in theoretical physics, with origins in 
electromagnetism and statistical mechanics. The emergence of duality in recent years in 
several areas of modern physics, ranging from the quantum Hall fluids to string theory, 
represents a major development in our understanding of quantum field theory. Here I 
touch upon one particular example just to give you a flavor of this vast subject. 

My plan is to treat a relativistic theory first, and after you get the hang of the subject, I 
will go on to discuss the nonrelativistic theory. It makes sense that some of the interesting 
physics of the nonrelativistic theory is absent in the relativistic formulation: A larger 
symmetry is more constraining. By the same token, the relativistic theory is actually much 
easier to understand if only because of notational simplicity. 


Vortices 

Couple a scalar field in (2 + 1)-dimensions to an external electromagnetic gauge potential, 
with the electric charge q indicated explicitly for later convenience: 

$$\mathcal{L} = \tfrac{1}{2}|(\partial_\mu - iqA_\mu)\varphi|^2 - V(\varphi^\dagger\varphi)$$  (1)

We have already studied this theory many times, most recently in chapter V.7 in connection 
with vortices. As usual, write $\varphi = |\varphi|e^{i\theta}$. Minimizing the potential V at $|\varphi| = v$ gives the 
ground state field configuration. Setting $\varphi = ve^{i\theta}$ in (1) we obtain 

$$\mathcal{L} = \tfrac{1}{2}v^2(\partial_\mu\theta - qA_\mu)^2$$  (2)


1 For a first introduction to duality, I highly recommend J. M. Figueroa-O’Farrill, Electromagnetic Duality for 
Children, http://www.maths.ed.ac.uk/~jmf/Teaching/Lectures/EDC.html. 
which, upon absorbing $\theta$ into $A$ by a gauge transformation, we recognize as the Meissner
Lagrangian. For later convenience we also introduce the alternative form

$$\mathcal{L} = -\frac{1}{2v^2}\xi_\mu^2 + \xi^\mu(\partial_\mu\theta - qA_\mu) \qquad (3)$$

We recover (2) upon eliminating the auxiliary field $\xi^\mu$ (see appendix A and chapter III.5).

In chapter V.7 we learned that the excitation spectrum includes vortices and antivortices,
located where $|\varphi|$ vanishes. If, around the zero of $|\varphi|$, $\theta$ changes by $2\pi$, we have
a vortex. Around an antivortex, $\Delta\theta = -2\pi$. Recall that around a vortex sitting at rest, the
electromagnetic gauge potential has to go as

$$qA_i \to \partial_i\theta \qquad (4)$$

at spatial infinity in order for the energy of the vortex to be finite, as we can see from (2).
The magnetic flux

$$\int d^2x\,\varepsilon_{ij}\partial_iA_j = \oint d\vec{x}\cdot\vec{A} = \frac{\Delta\theta_{\rm vortex}}{q} = \frac{2\pi}{q} \qquad (5)$$

is quantized in units of $2\pi/q$.

Let us pause to think physically for a minute. On a distance scale large compared to the
size of the vortex, vortices and antivortices appear as points. As discussed in chapter V.7,
the interaction energy of a vortex and an antivortex separated by a distance $R$ is given
by simply plugging into (2). Ignoring the probe field $A_\mu$, which we can take to be as
weak as possible, we obtain $\sim \int dr\, r(\nabla\theta)^2 \sim \log(R/a)$ where $a$ is some short distance
cutoff. But recall that the Coulomb interaction in 2-dimensional space is logarithmic since
by dimensional analysis $\int d^2k\,(e^{i\vec{k}\cdot\vec{x}}/k^2) \sim \log(|\vec{x}|/a)$ (with $a^{-1}$ some ultraviolet cutoff).
Thus, a gas of vortices and antivortices appears as a gas of point “charges” with a Coulomb
interaction between them.
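The logarithm is easy to check numerically. The following is only a sketch under a crude assumption of mine: far from the core a unit-winding phase has $|\nabla\theta| = 1/r$, and I simply cut the radial integral off at the separation $R$. The function name `pair_energy` and the midpoint rule are my own choices, not anything from the text.

```python
import math

def pair_energy(R, a, n=200_000):
    """Midpoint-rule estimate of the integral of r * (1/r)**2 over a < r < R."""
    # Assumption: |grad theta| = 1/r between the short distance cutoff a
    # and the vortex-antivortex separation R.
    h = (R - a) / n
    total = 0.0
    for i in range(n):
        r = a + (i + 0.5) * h
        total += r * (1.0 / r) ** 2 * h
    return total

if __name__ == "__main__":
    for R in (10.0, 100.0, 1000.0):
        # The two columns should agree: the energy grows like log(R/a).
        print(R, pair_energy(R, 1.0), math.log(R / 1.0))
```

Each factor of 10 in $R$ adds the same amount to the energy, which is the signature of a 2-dimensional Coulomb interaction.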


Vortex as charge in a dual theory 

Duality is often made out by some theorists to be a branch of higher mathematics but 
in fact it derives from an entirely physical idea. In view of the last paragraph, can we not 
rewrite the theory so that vortices appear as point “charges” of some as yet unknown gauge 
field? In other words, we want a dual theory in which the fundamental field creates and 
annihilates vortices rather than $\varphi$ quanta. We will explain the word “dual” in due time.

Remarkably, the rewriting can be accomplished in just a few simple steps. Proceeding
physically and heuristically, we picture the phase field $\theta$ as smoothly fluctuating, except
that here and there it winds around by $2\pi$. Write $\partial_\mu\theta = \partial_\mu\theta_{\rm smooth} + \partial_\mu\theta_{\rm vortex}$. Plugging into
(3) we write

$$\mathcal{L} = -\frac{1}{2v^2}\xi_\mu^2 + \xi^\mu(\partial_\mu\theta_{\rm smooth} + \partial_\mu\theta_{\rm vortex} - qA_\mu) \qquad (6)$$

Integrate over $\theta_{\rm smooth}$ and obtain the constraint $\partial_\mu\xi^\mu = 0$, which can be solved by writing

$$\xi^\mu = \varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda \qquad (7)$$

a trick we used earlier in chapter VI.2. As in that chapter, a gauge potential comes looking
for us, since the change $a_\lambda \to a_\lambda + \partial_\lambda\Lambda$ does not change $\xi^\mu$. Plugging into (6), we find

$$\mathcal{L} = -\frac{1}{4v^2}f_{\mu\nu}^2 + \varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda(\partial_\mu\theta_{\rm vortex} - qA_\mu) \qquad (8)$$

where $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$.

Our treatment is heuristic because we ignore the fact that $|\varphi|$ vanishes at the vortices.
Physically, we think of the vortices as almost pointlike so that $|\varphi| = v$ “essentially” everywhere.
As mentioned in chapter V.6, by appropriate choice of parameters we can make
solitons, vortices, and so on as small as we like. In other words, we neglect the coupling
between $\theta_{\rm vortex}$ and $|\varphi|$. A rigorous treatment would require a proper short distance cutoff
by putting the system on a lattice.² But as long as we capture the essential physics, as we
assuredly will, we will ignore such niceties.

Note for later use that the electromagnetic current $J^\mu$, defined as the coefficient of $-A_\mu$
in (8), is determined in terms of the gauge potential $a_\lambda$ to be

$$J^\mu = q\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda \qquad (9)$$

Let us integrate the term $\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda\,\partial_\mu\theta_{\rm vortex}$ in (8) by parts to obtain $a_\lambda\varepsilon^{\lambda\mu\nu}\partial_\mu\partial_\nu\theta_{\rm vortex}$.
According to Newton and Leibniz, $\partial_\mu$ commutes with $\partial_\nu$, and so apparently we get zero.
But $\partial_\mu$ and $\partial_\nu$ commute only when acting on a globally defined function, and, heavens to
Betsy, $\theta_{\rm vortex}$ is not globally defined since it changes by $2\pi$ when we go around a vortex.
In particular, consider a vortex at rest and look at the quantity $a_0$ couples to in (8), namely
$\varepsilon^{ij}\partial_i\partial_j\theta_{\rm vortex} = \nabla\times(\nabla\theta_{\rm vortex})$ in the notation of elementary physics. Integrating this over
a region containing the vortex gives $\int d^2x\,\nabla\times(\nabla\theta_{\rm vortex}) = \oint d\vec{x}\cdot\nabla\theta_{\rm vortex} = 2\pi$. Thus, we
recognize $(1/2\pi)\varepsilon^{ij}\partial_i\partial_j\theta_{\rm vortex}$ as the density of vortices, the time component of some vortex
current $j^\lambda_{\rm vortex}$. By Lorentz invariance, $j^\lambda_{\rm vortex} = (1/2\pi)\varepsilon^{\lambda\mu\nu}\partial_\mu\partial_\nu\theta_{\rm vortex}$.
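The statement that $\oint d\vec{x}\cdot\nabla\theta_{\rm vortex} = 2\pi$ only when the loop encloses the vortex can be seen in a few lines. This is a sketch with assumptions of mine: the vortex phase is modeled as $\theta = {\rm atan2}(y - y_0, x - x_0)$, the loop is a circle, and the name `winding` is hypothetical.

```python
import math

def winding(cx, cy, radius=1.0, n=1000, x0=0.0, y0=0.0):
    """Sum the increments of theta = atan2(y - y0, x - x0) around a circle
    of the given radius centred at (cx, cy)."""
    total = 0.0
    prev = None
    for i in range(n + 1):
        t = 2.0 * math.pi * i / n
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        theta = math.atan2(y - y0, x - x0)
        if prev is not None:
            d = theta - prev
            # Fold each small increment into [-pi, pi) so that the branch cut
            # of atan2 does not produce a spurious jump.
            d = (d + math.pi) % (2.0 * math.pi) - math.pi
            total += d
        prev = theta
    return total

if __name__ == "__main__":
    print(winding(0.0, 0.0))  # loop encloses the vortex at the origin
    print(winding(3.0, 0.0))  # loop misses the vortex
```

Dividing the first result by $2\pi$ gives the vortex number enclosed by the loop, which is what the density $(1/2\pi)\varepsilon^{ij}\partial_i\partial_j\theta_{\rm vortex}$ integrates to.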

Thus, we can now write (8) as

$$\mathcal{L} = -\frac{1}{4v^2}f_{\mu\nu}^2 + (2\pi)a_\mu j^\mu_{\rm vortex} - A_\mu(q\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda) \qquad (10)$$

Lo and behold, we have accomplished what we set out to do. We have rewritten the theory
so that the vortex appears as an “electric charge” for the gauge potential $a_\mu$. Sometimes
this is called a dual theory, but strictly speaking, it is more accurate to refer to it as the dual
representation of the original theory (1).

Let us introduce a complex scalar field $\Phi$, which we will refer to as the vortex field,
to create and annihilate the vortices and antivortices. In other words, we “elaborate” the
description in (10) to

$$\mathcal{L} = -\frac{1}{4v^2}f_{\mu\nu}^2 + \tfrac{1}{2}\left|(\partial_\mu - i(2\pi)a_\mu)\Phi\right|^2 - W(\Phi) - A_\mu(q\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda) \qquad (11)$$


2 For example, M. P. A. Fisher, “Mott Insulators, Spin Liquids, and Quantum Disordered Superconductivity,” 
cond-mat/9806164, appendix A. 



334 I VI. Field Theory and Condensed Matter 


The potential $W(\Phi)$ contains terms such as $\lambda(\Phi^\dagger\Phi)^2$ describing the short distance
interaction of two vortices (or a vortex and an antivortex). In principle, if we master all the short
distance physics contained in the original theory (1) then these terms are all determined
by the original theory.


Vortex of a vortex 

Now we come to the most fascinating aspect of the duality representation and the reason
why the word “dual” is used in the first place. The vortex field $\Phi$ is a complex scalar field,
just like the field $\varphi$ we started with. Thus we can perfectly well form a vortex out of $\Phi$,
namely a place where $\Phi$ vanishes and around which the phase of $\Phi$ goes through $2\pi$.
Amusingly, we are forming a vortex of a vortex, so to speak.

So, what is a vortex of a vortex? 

The duality theorem states that the vortex of a vortex is nothing but the original charge,
described by the field $\varphi$ we started out with! Hence the word duality.

The proof is remarkably simple. The vortex in the theory (11) carries “magnetic flux.”
Referring to (11) we see that $2\pi a_i \to \partial_i\theta$ at spatial infinity. By exactly the same manipulation
as in (5), we have

$$2\pi\int d^2x\,\varepsilon_{ij}\partial_ia_j = 2\pi\oint d\vec{x}\cdot\vec{a} = 2\pi \qquad (12)$$

Note that I put quotation marks around the term “magnetic flux” since as is evident I am
talking about the flux associated with the gauge potential $a_\mu$ and not the flux associated
with the electromagnetic potential $A_\mu$. But remember that from (9) the electromagnetic
current $J^\mu = q\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda$ and in particular $J^0 = q\varepsilon^{ij}\partial_ia_j$. Hence, the electric charge (note
no quotation marks) of this vortex of a vortex is equal to $\int d^2x\,J^0 = q$, precisely the charge
of the original complex scalar field $\varphi$. This proves the assertion.

Here we have studied vortices, but the same sort of duality also applies to monopoles.
As I remarked in chapter IV.4, duality allows us a glimpse into field theories in the strongly
coupled regime. We learned in chapter V.7 that certain spontaneously broken nonabelian
gauge theories in (3 + 1)-dimensional spacetime contain magnetic monopoles. We can
write a dual theory in terms of the monopole field out of which we can construct monopoles.
The monopole of a monopole turns out to be none other than the charged fields of the
original gauge theory. This duality was first conjectured many years ago by Olive and Montonen
and later shown to be realized in certain supersymmetric gauge theories by Seiberg
and Witten. The understanding of this duality was a “hot” topic a few years ago as it led
to deep new insight about how certain string theories are dual to each other.³ In contrast,
according to one of my distinguished condensed matter colleagues, the important notion
of duality is still underappreciated in the condensed matter physics community.


3 For example, D. I. Olive and P. C. West, eds., Duality and Supersymmetric Theories. 
Meissner begets Maxwell and so on

We will close by elaborating slightly on duality in (2 + 1)-dimensional spacetime and how
it might be relevant to the physics of 2-dimensional materials. Consider a Lagrangian $\mathcal{L}(a)$
quadratic in a vector field $a_\mu$. Couple an external electromagnetic gauge potential $A_\mu$ to
the conserved current $\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda$:

$$\mathcal{L} = \mathcal{L}(a) + A_\mu(\varepsilon^{\mu\nu\lambda}\partial_\nu a_\lambda) \qquad (13)$$

Let us ask: For various choices of $\mathcal{L}(a)$, if we integrate out $a$ what is the effective
Lagrangian $\mathcal{L}(A)$ describing the dynamics of $A$?

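If you have gotten this far in the book, you can easily do the integration. The central
identity of quantum field theory again! Given

$$\mathcal{L}(a) \sim aKa \qquad (14)$$

we have

$$\mathcal{L}(A) \sim (\varepsilon\partial A)\frac{1}{K}(\varepsilon\partial A) \sim A\left(\varepsilon\partial\frac{1}{K}\varepsilon\partial\right)A \qquad (15)$$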

We have three choices for $\mathcal{L}(a)$ to which I attach various illustrious names:

    $\mathcal{L}(a) \sim a^2$                              Meissner
    $\mathcal{L}(a) \sim a\varepsilon\partial a$                   Chern-Simons
    $\mathcal{L}(a) \sim f^2 \sim a\partial^2a$                    Maxwell          (16)

Since we are after conceptual understanding, I won’t bother to keep track of indices and
irrelevant overall constants. (You can fill them in as an exercise.) For example, given
$\mathcal{L}(a) \sim f_{\mu\nu}^2$ with $f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$ we can write $\mathcal{L}(a) \sim a\partial^2a$ and so $K = \partial^2$. Thus,
the effective dynamics of the external electromagnetic gauge potential is given by (15) as
$\mathcal{L}(A) \sim A[\varepsilon\partial(1/\partial^2)\varepsilon\partial]A \sim A^2$, the Meissner Lagrangian! In this “quick and dirty” way of
making a living, we simply set $\varepsilon\varepsilon \sim 1$ and cancel factors of $\partial$ in the numerator against
those in the denominator. Proceeding in this way, we construct the following table:


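    Dynamics of $a$                        $K$                  Effective Lagrangian                                                                        Dynamics of the
                                                                $\mathcal{L}(A) \sim A[\varepsilon\partial(1/K)\varepsilon\partial]A$                        external probe $A$

    Meissner $a^2$                         $1$                  $A(\varepsilon\partial\varepsilon\partial)A \sim A\partial^2A$                               Maxwell $F^2$
    Chern-Simons $a\varepsilon\partial a$  $\varepsilon\partial$ $A(\varepsilon\partial\frac{1}{\varepsilon\partial}\varepsilon\partial)A \sim A\varepsilon\partial A$  Chern-Simons $A\varepsilon\partial A$
    Maxwell $f^2 \sim a\partial^2a$        $\partial^2$         $A(\varepsilon\partial\frac{1}{\partial^2}\varepsilon\partial)A \sim AA$                     Meissner $A^2$

                                                                                                                                                             (17)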
Meissner begets Maxwell, Chern-Simons begets Chern-Simons, and Maxwell begets
Meissner. I find this beautiful and fundamental result, which represents a form of duality,
very striking. Chern-Simons is self-dual: It begets itself.
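The “quick and dirty” counting behind the table can be written out as a toy calculation. This is merely a bookkeeping sketch: I label each choice of $\mathcal{L}(a) \sim aKa$ by the number of derivatives in $K$, and the names `ORDER` and `begets` are my own.

```python
# Number of derivatives in K for each choice of L(a) ~ a K a.
ORDER = {"Meissner": 0, "Chern-Simons": 1, "Maxwell": 2}
NAME = {order: name for name, order in ORDER.items()}

def begets(name):
    """Dynamics of the probe A that follows from L(A) ~ A [eps d (1/K) eps d] A."""
    # Two derivatives come from the two factors of (eps d); dividing by K
    # removes as many derivatives as K contains.
    return NAME[2 - ORDER[name]]

if __name__ == "__main__":
    for name in ORDER:
        print(name, "begets", begets(name))
```

The map $n \to 2 - n$ on the number of derivatives has the single fixed point $n = 1$, which is the counting version of the statement that Chern-Simons is self-dual.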


Going nonrelativistic 

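It is instructive to compare the nonrelativistic treatment of duality.⁴ Go back to the
superfluid Lagrangian of chapter V.1:

$$\mathcal{L} = i\varphi^\dagger\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi - g^2(\varphi^\dagger\varphi - \bar\rho)^2 \qquad (18)$$

As before, substitute $\varphi = \sqrt{\rho}\,e^{i\theta}$ to obtain

$$\mathcal{L} = -\rho\,\partial_0\theta - \frac{\rho}{2m}(\partial_i\theta)^2 - g^2(\rho - \bar\rho)^2 + \cdots \qquad (19)$$

which we rewrite as

$$\mathcal{L} = -\xi_\mu\partial_\mu\theta + \frac{m}{2\rho}\xi_i^2 - g^2(\rho - \bar\rho)^2 + \cdots \qquad (20)$$

In (19) we have dropped a term $\sim (\partial_i\rho^{1/2})^2$. In (20) we have defined $\xi_0 \equiv \rho$. Integrating
out $\xi_i$ in (20) we recover (19).

All proceeds as before. Writing $\theta = \theta_{\rm smooth} + \theta_{\rm vortex}$ and integrating out $\theta_{\rm smooth}$, we
obtain the constraint $\partial_\mu\xi^\mu = 0$, solved by writing $\xi^\mu = \varepsilon^{\mu\nu\lambda}\partial_\nu\hat a_\lambda$. The hat on $\hat a_\lambda$ is for later
convenience. Note that the density

$$\xi_0 \equiv \rho = \varepsilon_{ij}\partial_i\hat a_j \equiv \hat f \qquad (21)$$

is the “magnetic” field strength while

$$\xi_i = \varepsilon_{ij}(\partial_0\hat a_j - \partial_j\hat a_0) \equiv \varepsilon_{ij}\hat f_{0j} \qquad (22)$$

is the “electric” field strength.

Putting all of this into (20) we have

$$\mathcal{L} = \frac{m}{2\rho}\hat f_{0i}^2 - g^2(\hat f - \bar\rho)^2 - 2\pi\hat a_\mu j^\mu_{\rm vortex} + \cdots \qquad (23)$$

To “subtract out” the background “magnetic” field $\bar\rho$, an obviously sensible move is to write

$$\hat a_\mu = \bar a_\mu + a_\mu \qquad (24)$$

where we define the background gauge potential by $\bar a_0 = 0$, $\partial_0\bar a_i = 0$ (no background
“electric” field) and

$$\varepsilon_{ij}\partial_i\bar a_j = \bar\rho \qquad (25)$$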


4 The treatment given here follows essentially that given by M. P. A. Fisher and D. H. Lee. 
The Lagrangian (23) then takes on the cleaner form

$$\mathcal{L} \simeq \left(\frac{m}{2\bar\rho}f_{0i}^2 - g^2f^2\right) - 2\pi a_\mu j^\mu_{\rm vortex} - 2\pi\bar a_i j^i_{\rm vortex} + \cdots \qquad (26)$$

We have expanded $\rho \sim \bar\rho$ in the first term. As in (10) the first two terms form the Maxwell
Lagrangian, and the ratio of their coefficients determines the speed of propagation

$$c = \left(\frac{2g^2\bar\rho}{m}\right)^{1/2} \qquad (27)$$

In suitable units in which $c = 1$, we have

$$\mathcal{L} = -\frac{m}{4\bar\rho}f_{\mu\nu}f^{\mu\nu} - 2\pi a_\mu j^\mu_{\rm vortex} - 2\pi\bar a_i j^i_{\rm vortex} + \cdots \qquad (28)$$

Compare this with (10).

The one thing we missed with our relativistic treatment is the last term in (28), for
the simple reason that we didn’t put in a background. Recall that the term like $A_iJ_i$ in
ordinary electromagnetism means that a moving particle associated with the current $J_i$
sees a magnetic field $\nabla\times\vec{A}$. Thus, a moving vortex will see a “magnetic field”

$$\varepsilon_{ij}\partial_i(\bar a_j + a_j) = \bar\rho + \varepsilon_{ij}\partial_ia_j \qquad (29)$$

equal to the sum of $\bar\rho$, the density of the original bosons, and a fluctuating field.

In the Coulomb gauge $\partial_ia_i = 0$ we have $(f_{0i})^2 = (\partial_0a_i)^2 + (\partial_ia_0)^2$, where the cross term
$(\partial_0a_i)(\partial_ia_0)$ effectively vanishes upon integration by parts. Integrating out the Coulomb
field $a_0$, we obtain

$$\mathcal{L} = -\frac{\bar\rho}{2m}(2\pi)^2\int\!\!\int d^2x\,d^2y\; j_0(\vec{x})\log\frac{|\vec{x} - \vec{y}|}{a}\,j_0(\vec{y}) + \frac{m}{2\bar\rho}(\partial_0a_i)^2 - g^2f^2 + 2\pi(a_i + \bar a_i)j_i^{\rm vortex} \qquad (30)$$

The vortices repel each other by a logarithmic interaction $\int d^2k\,(e^{i\vec{k}\cdot\vec{x}}/k^2) \sim \log(|\vec{x}|/a)$ as
we have known all along.


A self-dual theory 

Interestingly, the spatial part $f^2$ of the Maxwell Lagrangian comes from the short ranged
repulsion between the original bosons.

If we had taken the bosons to interact by an arbitrary potential $V(x)$ we would have,
instead of the last term in (20),

$$\int\!\!\int d^2x\,d^2y\,[\rho(x) - \bar\rho]\,V(x - y)\,[\rho(y) - \bar\rho] \qquad (31)$$

It is easy to see that all the steps go through essentially as before, but now the second term
in (26) becomes

$$\int\!\!\int d^2x\,d^2y\,f(x)\,V(x - y)\,f(y) \qquad (32)$$
Thus, the gauge field propagates according to the dispersion relation

$$\omega^2 = (2\bar\rho/m)V(k)k^2 \qquad (33)$$

where $V(k)$ is the Fourier transform of $V(x)$. In the special case $V(x) = g^2\delta^{(2)}(x)$ we
recover the linear dispersion given in (27). Indeed, we have a linear dispersion $\omega \propto |k|$ as
long as $V(x)$ is sufficiently short ranged for $V(k = 0)$ to be finite.

An interesting case is when $V(x)$ is logarithmic. Then $V(k)$ goes as $1/k^2$ and so $\omega \sim$
constant: The gauge field $a_i$ becomes massive and drops out. The low energy effective
theory consists of a bunch of vortices with a logarithmic interaction between them. Thus,
a theory of bosons with a logarithmic repulsion between them is self dual in the low energy
limit.
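To see the two cases side by side, here is a small sketch that evaluates the dispersion relation (33) for a contact potential and for a logarithmic one. The parameter values and the function names `omega`, `contact`, and `log_potential` are illustrative assumptions of mine; only the formula $\omega^2 = (2\bar\rho/m)V(k)k^2$ is taken from the text.

```python
import math

def omega(k, V_k, rho_bar=1.0, m=1.0):
    """Frequency from omega^2 = (2 rho_bar / m) V(k) k^2."""
    return math.sqrt(2.0 * rho_bar / m * V_k(k)) * abs(k)

def contact(k, g=1.0):
    # Fourier transform of g^2 delta^(2)(x): independent of k.
    return g * g

def log_potential(k, strength=1.0):
    # A logarithmic V(x) has V(k) ~ 1/k^2; the overall strength is arbitrary here.
    return strength / (k * k)

if __name__ == "__main__":
    for k in (0.1, 0.2, 0.4):
        print(k, omega(k, contact), omega(k, log_potential))
```

The first column of frequencies doubles when $k$ doubles, while the second stays put: with a logarithmic repulsion the gauge field has a gap and drops out at low energy.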


The dance of vortices and antivortices 

Having gone through this nonrelativistic discussion of duality, let us reward ourselves by
deriving the motion of vortices in a fluid. Let the bulk of the fluid be at rest. According to (28)
the vortex behaves like a charged particle in a background magnetic field $b$ proportional to
the mean density of the fluid $\bar\rho$. Thus, the force acting on a vortex is the usual Lorentz force
$\vec{v}\times\vec{B}$, and the equation of motion of the vortex in the presence of a force $F$ is then just

$$\bar\rho\,\varepsilon_{ij}\dot{x}_j = F_i \qquad (34)$$

This is the well-known result that a vortex, when pushed, moves in a direction perpendicular
to the force.

Consider two vortices. According to (30) they repel each other by a logarithmic interaction.
They move perpendicular to the force. Thus, they end up circling each other. In
contrast, consider a vortex and an antivortex, which attract each other. As a result of this
attraction, they both move in the same direction, perpendicular to the straight line joining
them (see fig. VI.3.1). The vortex and antivortex move along in step, maintaining the distance
between them. This in fact accounts for the famous motion of a smoke ring. If we
cut a smoke ring through its center and perpendicular to the plane it lies in, we have just
a vortex with an antivortex for each section. Thus, the entire smoke ring moves along in a
direction perpendicular to the plane it lies in.

[Figure VI.3.1, panels (a) and (b)]

All of this can be understood by elementary physics, as it should be. The key observation 
is simply that vortices and antivortices produce circular flows in the fluid around them, say 
clockwise for vortices and anticlockwise for antivortices. Another basic observation is that 
if there is a local flow in the fluid, then any object, be it a vortex or an antivortex, caught in 
it would just flow along in the same direction as the local flow. This is a consequence of 
Galilean invariance. By drawing a simple picture you can see that this produces the same 
pattern of motion as discussed above. 
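If you would rather let a computer draw the picture, the dance can be simulated directly from the first-order equation of motion (34). What follows is a sketch under assumptions of mine: I set $\bar\rho$ and the strength of the logarithmic force to 1, give an antivortex the opposite sign $s = -1$ in $s\,\varepsilon_{ij}\dot x_j = F_i$, and integrate with a standard fourth-order Runge-Kutta step. The names `velocities` and `evolve` are hypothetical, and the sense of rotation depends on sign conventions.

```python
import math

def velocities(p1, p2, s1, s2, strength=1.0):
    """Velocities of two point vortices obeying s * eps_ij * xdot_j = F_i."""
    dx, dy = p1[0] - p2[0], p1[1] - p2[1]
    r2 = dx * dx + dy * dy
    # Force on vortex 1 from a logarithmic interaction: like signs repel,
    # opposite signs attract. The force on vortex 2 is equal and opposite.
    f1 = (s1 * s2 * strength * dx / r2, s1 * s2 * strength * dy / r2)
    f2 = (-f1[0], -f1[1])
    # Solving eps_ij xdot_j = F_i / s gives xdot = (-F_y, F_x) / s.
    v1 = (-f1[1] / s1, f1[0] / s1)
    v2 = (-f2[1] / s2, f2[0] / s2)
    return v1, v2

def evolve(p1, p2, s1, s2, dt=0.01, steps=1000):
    """Integrate the pair forward in time with fourth-order Runge-Kutta."""
    def deriv(state):
        v1, v2 = velocities(state[0:2], state[2:4], s1, s2)
        return [v1[0], v1[1], v2[0], v2[1]]

    state = [p1[0], p1[1], p2[0], p2[1]]
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv([x + 0.5 * dt * k for x, k in zip(state, k1)])
        k3 = deriv([x + 0.5 * dt * k for x, k in zip(state, k2)])
        k4 = deriv([x + dt * k for x, k in zip(state, k3)])
        state = [x + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return (state[0], state[1]), (state[2], state[3])

if __name__ == "__main__":
    start = ((0.5, 0.0), (-0.5, 0.0))
    print("two vortices:      ", evolve(*start, 1, 1))
    print("vortex, antivortex:", evolve(*start, 1, -1))
```

Two vortices keep their separation and their midpoint while circling each other; the vortex and antivortex keep their separation but drift together along the direction perpendicular to the line joining them, just like a section of a smoke ring.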



VI.4 The σ Models as Effective Field Theories


The Lagrangian as a mnemonic 

Our beloved quantum field theory has had two near death experiences. The first started 
around the mid-1930s when physical quantities came out infinite. But it roared back to 
life in the late 1940s and early 1950s, thanks to the work of the generation that included 
Feynman, Schwinger, Dyson, and others. The second occurred toward the late 1950s. As 
we have already discussed, quantum field theory seemed totally incapable of addressing 
the strong interaction: The coupling was far too strong for perturbation theory to be of any 
use. Many physicists—known collectively as the S-matrix school—felt that field theory 
was irrelevant for studying the strong interaction and advocated a program of trying to 
derive results from general principles without using field theory. For example, in deriving 
the Goldberger-Treiman relation, we could have foregone any mention of field theory and 
Feynman diagrams. 

Eventually, in a reaction against this trend, people realized that if some results could be 
obtained from general considerations such as notions of spontaneous symmetry breaking 
and so forth, any Lagrangian incorporating these general properties had to produce the 
same results. At the very least, the Lagrangian provides a mnemonic for any physical result 
derived without using quantum field theory. Thus was born the notion of long distance 
or low energy effective field theory, which would prove enormously useful in both particle 
and condensed matter physics (as we have already seen and as we will discuss further in 
chapter VIII.3). 


The strong interaction at low energies 

One of the earliest examples is the σ model of Gell-Mann and Lévy, which describes the
interaction of nucleons and pions. We now know that the strong interaction has to be
described in terms of quarks and gluons. Nevertheless, at long distances, the degrees of
freedom are the two nucleons and the three pions. The proton and the neutron transform as
a spinor $\psi \equiv \binom{p}{n}$ under the $SU(2)$ of isospin. Consider the kinetic energy term $\bar\psi i\gamma\partial\psi =
\bar\psi_L i\gamma\partial\psi_L + \bar\psi_R i\gamma\partial\psi_R$. We note that this term has the larger symmetry $SU(2)_L \times SU(2)_R$,
with the left handed field $\psi_L$ and the right handed field $\psi_R$ transforming as a doublet
under $SU(2)_L$ and $SU(2)_R$, respectively. [The $SU(2)$ of isospin is the diagonal subgroup
of $SU(2)_L \times SU(2)_R$.] We can write $\psi_L \sim (\frac{1}{2}, 0)$ and $\psi_R \sim (0, \frac{1}{2})$.

Now we see a problem immediately: The mass term $m\bar\psi\psi = m(\bar\psi_L\psi_R + {\rm h.c.})$ is not
allowed since $\bar\psi_L\psi_R \sim (\frac{1}{2}, \frac{1}{2})$, a 4-dimensional representation of $SU(2)_L \times SU(2)_R$, a
group locally isomorphic to $SO(4)$.

At this point, lesser physicists would have said, what is the problem, we knew all along
that the strong interaction is invariant only under the $SU(2)$ of isospin, which we will
write as $SU(2)_I$. Under $SU(2)_I$ the bilinears constructed out of $\bar\psi_L$ and $\psi_R$ transform as
$\frac{1}{2} \times \frac{1}{2} = 0 + 1$, the singlet being $\bar\psi\psi$ and the triplet $\bar\psi i\gamma^5\tau^a\psi$. With only $SU(2)_I$ symmetry,
we can certainly include the mass term $\bar\psi\psi$.

To say it somewhat differently, to fully couple to the four bilinears we can construct out
of $\bar\psi_L$ and $\psi_R$, namely $\bar\psi\psi$ and $\bar\psi i\gamma^5\tau^a\psi$, we need four meson fields transforming as the
vector representation under $SO(4)$. But only the three pion fields are known. It seems
clear that we only have $SU(2)_I$ symmetry.

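Nevertheless, Gell-Mann and Lévy boldly insisted on the larger symmetry $SU(2)_L \times
SU(2)_R \simeq SO(4)$ and simply postulated an additional meson field, which they called $\sigma$,
so that $(\sigma, \vec\pi)$ form the 4-dimensional representation. I leave it to you to verify that
$\bar\psi_L(\sigma + i\vec\tau\cdot\vec\pi)\psi_R + {\rm h.c.} = \bar\psi(\sigma + i\vec\tau\cdot\vec\pi\gamma^5)\psi$ is invariant. Hence, we can write down the
invariant Lagrangian

$$\mathcal{L} = \bar\psi[i\gamma\partial + g(\sigma + i\vec\tau\cdot\vec\pi\gamma^5)]\psi + \mathcal{L}(\sigma, \vec\pi) \qquad (1)$$

where the part not involving the nucleons reads

$$\mathcal{L}(\sigma, \vec\pi) = \frac{1}{2}\left((\partial\sigma)^2 + (\partial\vec\pi)^2\right) + \frac{\mu^2}{2}(\sigma^2 + \vec\pi^2) - \frac{\lambda}{4}(\sigma^2 + \vec\pi^2)^2 \qquad (2)$$

This is known as the linear σ model.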

The σ model would have struck most physicists as rather strange at the time it was
introduced: The nucleon does not have a mass and there is an extra meson field. Aha, but
you would recognize (2) as precisely the Lagrangian (IV.1.2) (for $N = 4$) that we studied,
which exhibits spontaneous symmetry breaking. The four scalar fields $(\varphi_4, \varphi_1, \varphi_2, \varphi_3)$
in (IV.1.2) correspond to $(\sigma, \vec\pi)$. With no loss of generality, we can choose the vacuum
expectation value of $\varphi$ to point in the 4th direction, namely the vacuum in which $\langle 0|\sigma|0\rangle =
\sqrt{\mu^2/\lambda} \equiv v$ and $\langle 0|\vec\pi|0\rangle = 0$. Expanding $\sigma = v + \sigma'$ we see immediately that the nucleon
has a mass $M = gv$. You should not be surprised that the pion comes out massless. The
meson associated with the field $\sigma'$, which we will call the σ meson, has no reason to be
massless and indeed is not.
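A quick numerical check of this spectrum may be reassuring. The sketch below reads off the potential $V = -\frac{\mu^2}{2}\varphi^2 + \frac{\lambda}{4}(\varphi^2)^2$ from (2) and takes second derivatives at the vacuum by finite differences. The numerical values of $\mu^2$, $\lambda$, and $g$ are arbitrary assumptions, and the function names are my own.

```python
import math

def potential(phi, mu2=1.0, lam=0.5):
    """V = -(mu^2/2) phi.phi + (lam/4) (phi.phi)^2 with phi = (sigma, pi1, pi2, pi3)."""
    s = sum(x * x for x in phi)
    return -0.5 * mu2 * s + 0.25 * lam * s * s

def curvature(i, point, mu2=1.0, lam=0.5, eps=1e-4):
    """Second derivative of V along field direction i (central difference)."""
    up, down = list(point), list(point)
    up[i] += eps
    down[i] -= eps
    return (potential(up, mu2, lam) - 2.0 * potential(point, mu2, lam)
            + potential(down, mu2, lam)) / eps ** 2

def spectrum(mu2=1.0, lam=0.5, g=2.0):
    """Vacuum expectation value and squared masses around sigma = v, pi = 0."""
    v = math.sqrt(mu2 / lam)
    vacuum = [v, 0.0, 0.0, 0.0]
    return {
        "v": v,
        "M_nucleon": g * v,                         # M = g v
        "m_sigma2": curvature(0, vacuum, mu2, lam),  # expect 2 mu^2
        "m_pi2": curvature(1, vacuum, mu2, lam),     # expect 0
    }

if __name__ == "__main__":
    print(spectrum())
```

The curvature along the σ direction comes out as $2\mu^2$, while the curvature along any pion direction vanishes: the pions are the Nambu-Goldstone bosons.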

Can the all-important parameter $v$ be related to a measurable quantity? Indeed. From
chapter I.10 you will recall that the axial current is given by Noether’s theorem as
$J^a_{\mu 5} = \bar\psi\gamma_\mu\gamma^5(\tau^a/2)\psi + \pi^a\partial_\mu\sigma - \sigma\partial_\mu\pi^a$. After $\sigma$ acquires a vacuum expectation value,
$J^a_{\mu 5}$ contains a term $-v\partial_\mu\pi^a$. This term implies that the matrix element $\langle 0|J^a_{\mu 5}|\pi^b\rangle = ivk_\mu\delta^{ab}$,
where $k$ denotes the momentum of the pion, and thus $v$ is proportional to the $f$ defined in
chapter IV.2. Indeed, we recognize the mass relation $M = gv$ as precisely the Goldberger-Treiman
relation (IV.2.7) with $F(0) = 1$ (see exercise VI.4.4).

The nonlinear σ model

It was eventually realized that the main purpose in life of the potential in $\mathcal{L}(\sigma, \vec\pi)$ is to
force the vacuum expectation values of the fields to be what they are, so the potential can
be replaced by a constraint $\sigma^2 + \vec\pi^2 = v^2$. A more physical way of thinking about this point
is by realizing that the σ meson, if it exists at all, must be very broad since it can decay
via the strong interaction into two pions. We might as well force it out of the low energy
spectrum by making its mass large. By now, you have learned from chapters IV.1 and V.1
that the mass of the σ meson, namely $\sqrt{2}\mu$, can be taken to infinity while keeping $v$ fixed
by letting $\mu^2$ and $\lambda$ tend to infinity, keeping their ratio fixed.

We will now focus on $\mathcal{L}(\sigma, \vec\pi)$. Instead of thinking about $\mathcal{L}(\sigma, \vec\pi) = \frac{1}{2}[(\partial\sigma)^2 + (\partial\vec\pi)^2]$
with the constraint $\sigma^2 + \vec\pi^2 = v^2$, we can simply solve the constraint and plug the solution
$\sigma = \sqrt{v^2 - \vec\pi^2}$ into the Lagrangian, thus obtaining what is known as the nonlinear σ model
(writing $f$ for $v$ from here on):

$$\mathcal{L} = \frac{1}{2}\left[(\partial\vec\pi)^2 + \frac{(\vec\pi\cdot\partial\vec\pi)^2}{f^2 - \vec\pi^2}\right] = \frac{1}{2}(\partial\vec\pi)^2 + \frac{1}{2f^2}(\vec\pi\cdot\partial\vec\pi)^2 + \cdots \qquad (3)$$

Note that $\mathcal{L}$ can be written in the form $\mathcal{L} = (\partial\pi^a)G^{ab}(\vec\pi)(\partial\pi^b)$; some people like to think of
$G^{ab}$ as a “metric” in field space. [Incidentally, recall that way back in chapter I.3 we restricted
ourselves to the simplest possible kinetic energy term $\frac{1}{2}(\partial\varphi)^2$, rejecting possibilities such
as $U(\varphi)(\partial\varphi)^2$. But recall also that in chapter IV.3 we noted that such a term would arise by
quantum fluctuations.]
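The step from the constraint to the “metric” form can be checked numerically at a single point in field space. In this sketch I take one component of the derivative at a time, so `dpi` stands for $\partial\vec\pi$ along one spacetime direction; the derivative of $\sigma = \sqrt{v^2 - \vec\pi^2}$ is taken by finite differences so that the comparison is not circular. All names and numerical values are illustrative assumptions.

```python
import math

def sigma_of(pi, v):
    """Solve the constraint sigma^2 + pi.pi = v^2 for sigma."""
    return math.sqrt(v * v - sum(p * p for p in pi))

def kinetic_with_sigma(pi, dpi, v, eps=1e-6):
    """(1/2)[(d sigma)^2 + (d pi)^2], with d sigma from a central difference."""
    plus = [p + eps * d for p, d in zip(pi, dpi)]
    minus = [p - eps * d for p, d in zip(pi, dpi)]
    dsigma = (sigma_of(plus, v) - sigma_of(minus, v)) / (2.0 * eps)
    return 0.5 * (dsigma ** 2 + sum(d * d for d in dpi))

def metric(pi, v):
    """G^{ab} = (1/2)[delta^{ab} + pi^a pi^b / (v^2 - pi.pi)]."""
    denom = v * v - sum(p * p for p in pi)
    return [[0.5 * ((1.0 if a == b else 0.0) + pi[a] * pi[b] / denom)
             for b in range(3)] for a in range(3)]

def kinetic_with_metric(pi, dpi, v):
    """(d pi^a) G^{ab}(pi) (d pi^b)."""
    G = metric(pi, v)
    return sum(dpi[a] * G[a][b] * dpi[b] for a in range(3) for b in range(3))

if __name__ == "__main__":
    pi, dpi, v = (0.1, -0.2, 0.3), (0.5, 0.4, -0.7), 1.0
    print(kinetic_with_sigma(pi, dpi, v), kinetic_with_metric(pi, dpi, v))
```

The two numbers agree, which is all that (3) says: eliminating $\sigma$ turns a flat kinetic term in four fields into a curved one in three.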

In accordance with the philosophy that introduced this chapter, any Lagrangian that 
captures the correct symmetry properties should describe the same low energy physics.¹
This means that anybody, including you, can introduce his or her own parametrization of 
the fields. 

The nonlinear σ model is actually an example of a broad class of field theories whose
Lagrangian has a simple form but with the fields appearing in it subject to some nontrivial
constraint. An example is the theory defined by

$$\mathcal{L}(U) = \frac{f^2}{4}\,{\rm tr}(\partial_\mu U^\dagger\cdot\partial^\mu U) \qquad (4)$$

with $U(x)$ a matrix-valued field and an element of $SU(2)$. Indeed, if we write $U = e^{(i/f)\vec\pi\cdot\vec\tau}$
we see that $\mathcal{L}(U) = \frac{1}{2}(\partial\vec\pi)^2 + (1/2f^2)(\vec\pi\cdot\partial\vec\pi)^2 + \cdots$, identical to (3) up to the terms
indicated. The $\vec\pi$ field here is related to the one in (3) by a field redefinition.

There is considerably more we can say about the nonlinear σ models and their applications
in particle and condensed matter physics, but a thorough discussion would take us
far beyond the scope of this book. Instead, I will develop some of their properties in the
exercises and in the next chapter will sketch how they can arise in one class of condensed
matter systems.


1 S. Weinberg, Physica 96A: 327, 1979.


Exercises 

VI.4.1 Show that the vacuum expectation value of $(\sigma, \vec\pi)$ can indeed point in any direction without changing the
physics. At first sight, this statement seems strange since, by virtue of its $\gamma^5$ coupling to the nucleon, the
pion is a pseudoscalar field and cannot have a vacuum expectation value without breaking parity. But $(\sigma, \vec\pi)$
are just Greek letters. Show that by a suitable transformation of the nucleon field parity is conserved, as
it should be in the strong interaction.

VI.4.2 Calculate the pion-pion scattering amplitude up to quadratic order in the external momenta, using the
nonlinear σ model (3). [Hint: For help, see S. Weinberg, Phys. Rev. Lett. 17: 616, 1966.]

VI.4.3 Calculate the pion-pion scattering amplitude up to quadratic order in the external momenta, using the
linear σ model (2). Don’t forget the Feynman diagram involving σ meson exchange. You should get the
same result as in exercise VI.4.2.

VI.4.4 Show that the mass relation $M = gv$ amounts to the Goldberger-Treiman relation.



VI.5 Ferromagnets and Antiferromagnets


Magnetic moments 

In chapters IV.1 and V.3 I discussed how the concept of the Nambu-Goldstone boson
originated as the spin wave in a ferromagnetic or an antiferromagnetic material. A cartoon
description of such materials consists of a regular lattice on each site of which sits a
local magnetic moment, which we denote by a unit vector $\vec{n}_j$ with $j$ labeling the site.
In a ferromagnetic material the magnetic moments on neighboring sites want to point
in the same direction, while in an antiferromagnetic material the magnetic moments
on neighboring sites want to point in opposite directions. In other words, the energy is
$\mathcal{H} = J\sum_{\langle ij\rangle}\vec{n}_i\cdot\vec{n}_j$, where $i$ and $j$ label neighboring sites. For antiferromagnets $J > 0$,
and for ferromagnets $J < 0$. I will merely allude to the fully quantum description
formulated in terms of a spin operator $\vec{S}_j$ on each site $j$; the subject lies far beyond the
scope of this text.

In a more microscopic treatment, we would start with a Hamiltonian (such as the
Hubbard Hamiltonian) describing the hopping of electrons and the interaction between
them. Within some approximate mean field treatment the classical variable $\vec{n}_j$ would then
emerge as the unit vector pointing in the direction of $\langle c_j^\dagger\vec\sigma c_j\rangle$ with $c_j^\dagger$ and $c_j$ the electron
creation and annihilation operators, respectively. But this is not a text on solid state physics.


First versus second order in time 

Here we would like to derive an effective low energy description of the ferromagnet and
antiferromagnet in the spirit of the σ model description of the preceding chapter. Our
treatment will be significantly longer than the standard discussion given in some field
theory texts, but has the slight advantage of being correct.

The somewhat subtle issue is what kinetic energy term we have to add to $-\mathcal{H}$ to form the
Lagrangian $L$. Since for a unit vector $\vec{n}$ we have $\vec{n}\cdot(d\vec{n}/dt) = \frac{1}{2}\,d(\vec{n}\cdot\vec{n})/dt = 0$, we cannot
make do with one time derivative. With two derivatives we can form $(d\vec{n}/dt)\cdot(d\vec{n}/dt)$
and so

$$L_{\rm wrong} = \frac{1}{2g^2}\sum_j\frac{\partial\vec{n}_j}{\partial t}\cdot\frac{\partial\vec{n}_j}{\partial t} - J\sum_{\langle ij\rangle}\vec{n}_i\cdot\vec{n}_j \qquad (1)$$


A typical field theory text would then pass to the continuum limit and arrive at the
Lagrangian density

$$\mathcal{L} = \frac{1}{2g^2}\left(\frac{\partial\vec{n}}{\partial t}\cdot\frac{\partial\vec{n}}{\partial t} - c_s^2\sum_l\frac{\partial\vec{n}}{\partial x^l}\cdot\frac{\partial\vec{n}}{\partial x^l}\right) \qquad (2)$$

with the constraint $[\vec{n}(x, t)]^2 = 1$. This is another example of a nonlinear σ model. Just as
in the nonlinear σ model discussed in chapter VI.4, the Lagrangian looks free, but the
nontrivial dynamics comes from the constraint. The constant $c_s$ (which is determined in
terms of the microscopic variable $J$) is the spin wave velocity, as you can see by writing
down the equation of motion $(\partial^2/\partial t^2)\vec{n} - c_s^2\nabla^2\vec{n} = 0$.

But you can feel that something is wrong. You learned in a quantum mechanics course
that the dynamics of a spin variable $\vec{S}$ is first order in time. Consider the most basic
example of a spin in a constant magnetic field described by $H = \mu\vec{S}\cdot\vec{B}$. Then $d\vec{S}/dt =
i[H, \vec{S}] = \mu\vec{B}\times\vec{S}$. Besides, you might remember from a solid state physics course that in
a ferromagnet the dispersion relation of the spin wave has the nonrelativistic form $\omega \propto k^2$
and not the relativistic form $\omega^2 \propto k^2$ implied by (2).

The resolution of this apparent paradox is based on the Pauli-Hopf identity: Given a unit
vector $\vec{n}$ we can always write $\vec{n} = z^\dagger\vec\sigma z$, where $z = \binom{z_1}{z_2}$ consists of two complex numbers
such that $z^\dagger z = z_1^\dagger z_1 + z_2^\dagger z_2 = 1$. Verify this! (A mathematical aside: Writing $z_1$ and $z_2$ out
in terms of real numbers we see that this defines the so-called Hopf map $S^3 \to S^2$.) While
we cannot form a term quadratic in $\vec{n}$ and linear in time derivative, we can write a term
quadratic in the complex doublet $z$ and linear in time derivative. Can you figure it out
before looking at the next line?
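If you prefer to have a machine do the verifying, here is a sketch that checks one half of the Pauli-Hopf identity: any normalized complex doublet gives a real unit vector. The standard Pauli matrices are written out by hand, and the function name `hopf` is my own.

```python
import math

# The three Pauli matrices, written as nested tuples.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

def hopf(z1, z2):
    """Return n = z^dagger sigma z for the normalized doublet z = (z1, z2)."""
    norm = math.sqrt(abs(z1) ** 2 + abs(z2) ** 2)
    z = (complex(z1) / norm, complex(z2) / norm)
    n = []
    for s in SIGMA:
        value = sum(z[a].conjugate() * s[a][b] * z[b]
                    for a in range(2) for b in range(2))
        n.append(complex(value))
    return n

if __name__ == "__main__":
    for doublet in ((1, 0), (1, 1), (1, 1j), (0.3 + 0.4j, -1.2 + 0.5j)):
        n = hopf(*doublet)
        length = math.sqrt(sum(c.real ** 2 for c in n))
        print(doublet, [round(c.real, 6) for c in n], round(length, 6))
```

The imaginary parts vanish and the length is 1 for every doublet tried. Note that multiplying $z$ by an overall phase leaves $\vec{n}$ unchanged, which is why three real numbers in $z$ (after normalization) map onto the two of $\vec{n}$.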

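The correct version of (1) is

$$L_{\rm correct} = i\sum_j z_j^\dagger\frac{\partial z_j}{\partial t} + \frac{1}{2g^2}\sum_j\frac{\partial\vec{n}_j}{\partial t}\cdot\frac{\partial\vec{n}_j}{\partial t} - J\sum_{\langle ij\rangle}\vec{n}_i\cdot\vec{n}_j \qquad (3)$$

The added term is known as the Berry’s phase term and has deep topological meaning.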
You should derive the equation of motion using the identity 


Remarkably, although $z_j^\dagger(dz_j/dt)$ cannot be written simply in terms of $\vec{n}_j$, its variation
can be.


Low energy modes in the ferromagnet and the antiferromagnet 

In the ground state of a ferromagnet, the magnetic moments all point in the same
direction, which we can choose to be the $z$-direction. Expanding the equation of motion in
small fluctuations around this ground state $\vec{n}_j = \vec{e}_z + \delta\vec{n}_j$ (where evidently $\vec{e}_z$ denotes the
appropriate unit vector) and Fourier transforming, we obtain

$$\begin{pmatrix} -\dfrac{\omega^2}{g^2} + h(k) & -\tfrac{1}{2}i\omega \\ \tfrac{1}{2}i\omega & -\dfrac{\omega^2}{g^2} + h(k) \end{pmatrix}\begin{pmatrix} \delta n_x(k) \\ \delta n_y(k) \end{pmatrix} = 0 \qquad (5)$$

linking the two components $\delta n_x(k)$ and $\delta n_y(k)$ of $\delta\vec{n}(k)$. The condition $\vec{n}_j\cdot\vec{n}_j = 1$ says that
$\delta n_z(k) = 0$. Here $a$ is the lattice spacing and $h(k) \equiv 4J[2 - \cos(k_xa) - \cos(k_ya)] \simeq 2Ja^2k^2$
for small $k$. (I am implicitly working in two spatial dimensions as evidenced by $k_x$ and $k_y$.)

At low frequency the Berry term $i\omega$ dominates the naive term $\omega^2/g^2$, which we can
therefore throw away. Setting the determinant of the matrix equal to zero, we see that we
get the correct quadratic dispersion relation $\omega \propto k^2$.
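The claim can be tested without throwing anything away. The sketch below assumes that the matrix in (5) has diagonal entries $-\omega^2/g^2 + h(k)$ and off-diagonal Berry entries $\mp\frac{1}{2}i\omega$ (the form visible in (6)), so that the determinant condition reads $(h - \omega^2/g^2)^2 = \omega^2/4$. The values $J = a = g = 1$ and the function names are illustrative assumptions of mine.

```python
import math

def h(kx, ky, J=1.0, a=1.0):
    """h(k) = 4J[2 - cos(kx a) - cos(ky a)]."""
    return 4.0 * J * (2.0 - math.cos(kx * a) - math.cos(ky * a))

def low_branch(kx, ky, J=1.0, a=1.0, g=1.0):
    """Smaller positive root of (h - w^2/g^2)^2 = w^2/4."""
    hk = h(kx, ky, J, a)
    # Taking h - w^2/g^2 = +w/2 gives w^2/g^2 + w/2 - h = 0, whose positive
    # root tends to 2 h(k) as k -> 0.
    return 0.5 * g * g * (-0.5 + math.sqrt(0.25 + 4.0 * hk / (g * g)))

if __name__ == "__main__":
    for k in (0.01, 0.02, 0.04):
        print(k, low_branch(k, 0.0), 2.0 * h(k, 0.0))
```

Doubling $k$ multiplies the frequency by very nearly 4, so the low branch is indeed quadratic; the other root of the determinant condition stays at a frequency of order $g^2$.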

The treatment of the antiferromagnet is interestingly different. The so-called Néel state¹
for an antiferromagnet is defined by $\vec{n}_j = (-1)^j\vec{e}_z$. Writing $\vec{n}_j = (-1)^j\vec{e}_z + \delta\vec{n}_j$, we obtain

$$\begin{pmatrix} -\dfrac{\omega^2}{g^2} + f(k) & -\tfrac{1}{2}i\omega \\ \tfrac{1}{2}i\omega & -\dfrac{\omega^2}{g^2} + f(k) \end{pmatrix}\begin{pmatrix} \delta n_x(k) \\ \delta n_y(k + Q) \end{pmatrix} = 0 \qquad (6)$$

linking $\delta n_x$ and $\delta n_y$ evaluated at different momenta. Here $f(k) = 4J[2 + \cos(k_xa) + \cos(k_ya)]$
and $Q = [\pi/a, \pi/a]$. The appearance of $Q$ is due to $(-1)^j = e^{iQaj}$.
(I will let you figure out the somewhat overly compact notation.) The antiferromagnetic
factor $(-1)^j$ explicitly breaks translation invariance and kicks in the momentum $Q$ whenever
it occurs. A similar equation links $\delta n_y(k)$ and $\delta n_x(k + Q)$. Solving these equations,
you will find that there is a high frequency branch that we are not interested in and a low
frequency branch with the linear dispersion $\omega \propto k$.

Thus, the low frequency dynamics of the antiferromagnet can be described by the
nonlinear $\sigma$ model (2), which when the spin wave velocity is normalized to 1 can be written
in the relativistic form:

$$\mathcal{L} = \frac{1}{2g^2}\, \partial_\mu \vec n \cdot \partial^\mu \vec n \tag{7}$$


Exercises 

VI. 5.1 Work out the two branches of the spin wave spectrum in the ferromagnetic case, paying particular 
attention to the polarization. 

VI. 5.2 Verify that in the antiferromagnetic case the Berry’s phase term merely changes the spin wave velocity 
and does not affect the spectrum qualitatively as in the ferromagnetic case. 


¹ Note that while the Néel state describes the lowest energy configuration for a classical antiferromagnet, it
does not describe the ground state of a quantum antiferromagnet. The terms $S_i^+ S_j^- + S_i^- S_j^+$ in the Hamiltonian
$J \sum_{\langle ij \rangle} \vec S_i \cdot \vec S_j$ flip the spins up and down.



VI.6 Surface Growth and Field Theory




In this chapter I will discuss a topic, rather unusual for a field theory text, taken from
nonequilibrium statistical mechanics, one of the hottest growth fields in theoretical physics
over the last few years. I want to introduce you to yet another area in which field theoretic
concepts are of use.

Imagine atoms being deposited randomly on some surface. This is literally how some 
novel materials are grown. The height h (x, t) of the surface as it grows is governed by the 
Kardar-Parisi-Zhang equation 

$$\frac{\partial h}{\partial t} = \nu\nabla^2 h + \frac{\lambda}{2}(\nabla h)^2 + \eta(x, t) \tag{1}$$

This equation describes a deceptively simple prototype of nonequilibrium dynamics and 
has a remarkably wide range of applicability. 

To understand (1), consider the various terms on the right-hand side. The term $\nu\nabla^2 h$
(with $\nu > 0$) is easy to understand: Positive in the valleys of $h$ and negative on the peaks,
it tends to smooth out the surface. With only this term the problem would be linear and
hence trivial. The nonlinear term $(\lambda/2)(\nabla h)^2$ renders the problem highly nontrivial and
interesting; I leave it to you as an exercise to convince yourself of the geometric origin of
this term. The third term describes the random arrival of atoms, with the random variable
$\eta(x, t)$ usually assumed to be Gaussian distributed, with zero mean,¹ and correlations

$$\langle \eta(x, t)\, \eta(x', t') \rangle = 2\sigma^2 \delta^D(x - x')\, \delta(t - t') \tag{2}$$

In other words, the probability distribution for a particular $\eta(x, t)$ is given by

$$P(\eta) \propto e^{-\frac{1}{2\sigma^2} \int d^Dx\, dt\, \eta(x, t)^2}$$

Here $x$ represents coordinates in $D$-dimensional space. Experimentally, $D = 2$ for the
situation I described, but theoretically we are free to investigate the problem for any $D$.
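
To get a feel for what the three terms in (1) do, here is a minimal numerical sketch. It is an illustration under stated assumptions rather than a recommended integrator: I take $D = 1$ with periodic boundary conditions, use a plain explicit Euler step, and discretize the white noise so that its variance per cell and per time step is $2\sigma^2/(\Delta x\, \Delta t)$, in line with (2). The function names `kpz_step`, `evolve`, and `mean` are made up for this example.

```python
import math
import random


def kpz_step(h, nu, lam, sigma, dx, dt, rng):
    """Advance a 1D periodic height profile by one explicit Euler step."""
    n = len(h)
    new_h = [0.0] * n
    # Discretized white noise has variance 2*sigma^2/(dx*dt), so the height
    # increment dt*eta has standard deviation sqrt(2*sigma^2*dt/dx).
    noise_amp = math.sqrt(2.0 * sigma ** 2 * dt / dx)
    for i in range(n):
        left, right = h[i - 1], h[(i + 1) % n]  # h[-1] wraps around (periodic)
        laplacian = (left - 2.0 * h[i] + right) / dx ** 2
        slope = (right - left) / (2.0 * dx)
        drift = nu * laplacian + 0.5 * lam * slope ** 2
        new_h[i] = h[i] + dt * drift + noise_amp * rng.gauss(0.0, 1.0)
    return new_h


def evolve(h, steps, nu=1.0, lam=1.0, sigma=0.0, dx=1.0, dt=0.01, seed=0):
    """Repeat kpz_step; the defaults keep nu*dt/dx^2 well inside the stable range."""
    rng = random.Random(seed)
    for _ in range(steps):
        h = kpz_step(h, nu, lam, sigma, dx, dt, rng)
    return h


def mean(h):
    return sum(h) / len(h)


if __name__ == "__main__":
    bump = [math.exp(-((i - 32) / 4.0) ** 2) for i in range(64)]
    smoothed = evolve(bump, steps=500, lam=0.0)
    grown = evolve(bump, steps=500, lam=1.0)
    print("peak before and after smoothing:", max(bump), max(smoothed))
    print("mean height with lam = 0 and lam = 1:", mean(smoothed), mean(grown))
```

With the noise switched off, the $\nu$ term alone lowers the peak while conserving the mean height, whereas a positive $\lambda$ makes the mean height grow wherever the surface is tilted. Turning on `sigma` adds the random deposition.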


¹ There is no loss of generality here since an additive constant in $\eta$ can be absorbed by the shift $h \to h + ct$.

Typically, condensed matter physicists are interested in calculating the correlation between
the height of the surface at two different positions in space and time:

$$\langle [h(x, t) - h(x', t')]^2 \rangle = |x - x'|^{2\chi} f\left( \frac{|x - x'|^z}{|t - t'|} \right) \tag{3}$$

The bracket $\langle \cdots \rangle$ here and in (2) denotes averaging over different realizations of the
random variable $\eta(x, t)$. On the right-hand side of (3) I have written the dynamic scaling
form typically postulated in condensed matter physics, where $\chi$ and $z$ are the so-called
roughness and dynamic exponents. The challenge is then to show that the scaling form is
correct and to calculate $\chi$ and $z$. Note that the dynamic exponent $z$ (which in general is not
an integer) tells us, roughly speaking, how many powers of space are worth one power of
time. (For $\lambda = 0$ we have simple diffusion for which $z = 2$.) Here $f$ denotes an unknown
function.

I will not go into more technical details. Our interest here is to see how this problem, 
which does not even involve quantum mechanics, can be converted into a quantum field 
theory. Start with 


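$$Z = \int Dh \int D\eta\; e^{-\frac{1}{2\sigma^2} \int d^Dx\, dt\, \eta(x, t)^2}\; \delta\left[ \frac{\partial h}{\partial t} - \nu\nabla^2 h - \frac{\lambda}{2}(\nabla h)^2 - \eta(x, t) \right] \tag{4}$$

Integrating over $\eta$, we obtain $Z = \int Dh\, e^{-S(h)}$ with the action

$$S(h) = \frac{1}{2\sigma^2} \int d^Dx\, dt \left[ \frac{\partial h}{\partial t} - \nu\nabla^2 h - \frac{\lambda}{2}(\nabla h)^2 \right]^2 \tag{5}$$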


You will recognize that this describes a nonrelativistic field theory of a scalar field h(x, t). 
The physical quantity we are interested in is then given by 


$$\langle [h(x, t) - h(x', t')]^2 \rangle = \frac{1}{Z} \int Dh\; e^{-S(h)}\, [h(x, t) - h(x', t')]^2 \tag{6}$$

Thus, the challenge of determining the roughness and dynamic exponents in statistical
physics is equivalent to the problem of determining the propagator

$$D(x, t) = \frac{1}{Z} \int Dh\; e^{-S(h)}\, h(x, t)\, h(0, 0)$$


of the scalar field h. 

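Incidentally, by scaling $t \to t/\nu$ and $h \to \sqrt{\sigma^2/\nu}\; h$, we can write the action as

$$S(h) = \frac{1}{2} \int d^Dx\, dt \left[ \left( \frac{\partial}{\partial t} - \nabla^2 \right) h - \frac{g}{2}(\nabla h)^2 \right]^2 \tag{7}$$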


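with $g^2 = \lambda^2\sigma^2/\nu^3$. Expanding the action in powers of $h$ as usual

$$S(h) = \frac{1}{2} \int d^Dx\, dt \left\{ \left[ \left( \frac{\partial}{\partial t} - \nabla^2 \right) h \right]^2 - g(\nabla h)^2 \left( \frac{\partial}{\partial t} - \nabla^2 \right) h + \frac{g^2}{4}(\nabla h)^4 \right\} \tag{8}$$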


we recognize the quadratic term as giving us the rather unusual propagator $1/(\omega^2 + k^4)$ for
the scalar field $h$, and the cubic and quartic terms as describing the interaction. As always,
to calculate the desired physical quantity we evaluate the functional or “path” integral

$$Z(J) = \int Dh\; e^{-S(h) + \int d^Dx\, dt\, J(x, t)\, h(x, t)}$$

and then functionally differentiate repeatedly with respect to $J$.

[Figure VI.6.1]
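
Two statements made above are easy to check with a few lines of arithmetic, and doing so is a useful sanity test of signs and factors. The sketch below assumes the plane wave convention $h \propto e^{i(kx - \omega t)}$, so that $\partial/\partial t - \nabla^2$ acts as $-i\omega + k^2$, and it tests the rescaling of the action only pointwise, treating the rescaled $\partial h/\partial t$, $\nabla^2 h$, and $(\nabla h)^2$ as three arbitrary numbers `a`, `b`, `c`. With $\lambda$, $\sigma$, $\nu$ positive, the rescaling goes through for $g^2 = \lambda^2\sigma^2/\nu^3$. The function names are invented for this illustration.

```python
import math


def quadratic_kernel(omega, k):
    """|-i*omega + k^2|^2: Fourier-space kernel of [(d/dt - grad^2) h]^2."""
    symbol = complex(k * k, -omega)  # d/dt -> -i*omega, -grad^2 -> +k^2
    return abs(symbol) ** 2


def propagator(omega, k):
    """Inverse of the quadratic kernel, which should equal 1/(omega^2 + k^4)."""
    return 1.0 / quadratic_kernel(omega, k)


def rescaling_holds(a, b, c, nu, lam, sigma):
    """Pointwise check that the original integrand becomes the rescaled one.

    a, b, c stand for the rescaled dh/dt, laplacian of h, and (grad h)^2.
    """
    s = math.sqrt(sigma ** 2 / nu)   # h = s * h_rescaled
    g = lam * sigma / nu ** 1.5      # assumes lam, sigma, nu > 0
    bracket = nu * s * a - nu * s * b - 0.5 * lam * s * s * c
    original = bracket ** 2 / (2.0 * sigma ** 2) / nu  # 1/nu from dt = dt'/nu
    rescaled = 0.5 * (a - b - 0.5 * g * c) ** 2
    return math.isclose(original, rescaled, rel_tol=1e-9, abs_tol=1e-12)


if __name__ == "__main__":
    print(propagator(2.0, 3.0), 1.0 / (2.0 ** 2 + 3.0 ** 4))
    print(rescaling_holds(0.7, -1.3, 2.1, nu=0.5, lam=1.7, sigma=0.9))
```

The $k^4$ rather than $k^2$ in the denominator reflects the fact that one time derivative is traded against two space derivatives, which is the free-field statement $z = 2$.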

My intent here is not so much to teach you nonequilibrium statistical mechanics as to 
show you that quantum field theory can emerge in a variety of physical situations, including 
those involving only purely classical physics. Note that the “quantum fluctuations” here 
arise from the random driving term. Evidently, there is a close methodological connection 
between random dynamics and quantum physics. 


Exercises 

VI.6.1 An exercise in elementary geometry: Draw a straight line tilted at an angle $\theta$ with respect to the horizontal.
The line represents a small segment of the surface at time $t$. Now draw a number of circles of diameter
$d$ tangent to and on top of this line. Next draw another line tilted at angle $\theta$ with respect to the horizontal
and lying on top of the circles, namely tangent to them. This new line represents the segment of the
surface some time later (see fig. VI.6.1). Note that $\Delta h = d/\cos\theta \simeq d(1 + \frac{1}{2}\theta^2)$. Show that this generates
the nonlinear term $(\lambda/2)(\nabla h)^2$ in the KPZ equation (1). For applications of the KPZ equation, see for
example, T. Halpin-Healy and Y.-C. Zhang, Phys. Rep. 254: 215, 1995; A.-L. Barabási and H. E. Stanley,
Fractal Concepts in Surface Growth.

VI.6.2 Show that the scalar field $h$ has the propagator $1/(\omega^2 + k^4)$.

VI.6.3 Field theory can often be cast into apparently rather different forms by a change of variable. Show that
by writing $U = e^{\frac{1}{2}gh}$ we can change the action (7) to

$$S = \frac{2}{g^2} \int d^Dx\, dt \left( U^{-1} \frac{\partial}{\partial t} U - U^{-1} \nabla^2 U \right)^2 \tag{9}$$

a kind of nonlinear $\sigma$ model.





VI.7 Disorder: Replicas and Grassmannian Symmetry


Impurities and random potential 

An important area in condensed matter physics involves the study of disordered systems, 
a subject that has been the focus of a tremendous amount of theoretical work over the 
last few decades. Electrons in real materials scatter off the impurities inevitably present 
and effectively move in a random potential. In the spirit of this book I will give you a brief 
introduction to this fascinating subject, showing how the problem can be mapped into a 
quantum field theory. 

The prototype problem is that of a quantum particle obeying the Schrödinger equation
$H\psi = [-\nabla^2 + V(x)]\psi = E\psi$, where $V(x)$ is a random potential (representing the
impurities) generated with the Gaussian white noise probability distribution
$P(V) = \mathcal{N} e^{-\int d^Dx\, (1/2g^2) V(x)^2}$ with the normalization factor $\mathcal{N}$ determined by $\int DV\, P(V) = 1$. The
parameter g measures the strength of the impurities: the larger g, the more disordered the 
system. This of course represents an idealization in which interaction between electrons 
and a number of other physical effects are neglected. 

As in statistical mechanics we think of an ensemble of systems each of which is characterized
by a particular function $V(x)$ taken from the distribution $P(V)$. We study the average
or typical properties of the system. In particular, we might want to know the averaged
density of states defined by $\rho(E) = \langle \operatorname{tr}\, \delta(E - H) \rangle = \langle \sum_i \delta(E - E_i) \rangle$, where the sum runs
over the $i$th eigenstate of $H$ with corresponding eigenvalue $E_i$. We denote by
$\langle O(V) \rangle = \int DV\, P(V)\, O(V)$ the average of any functional $O(V)$ of $V(x)$. Clearly, $\int_{E^*}^{E^* + \delta E} dE\, \rho(E)$
counts the number of states in the interval from $E^*$ to $E^* + \delta E$, an important quantity in,
for example, tunneling experiments.
Anderson localization

Another important physical question is whether the wave functions at a particular energy
$E$ extend over the entire system or are localized within a characteristic length scale $\xi(E)$.
Clearly, this issue determines whether the material is a conductor or an insulator. At first
sight, you might think that we should study

$$S(x, y; E) = \Big\langle \sum_i \delta(E - E_i)\, \psi_i^*(x)\, \psi_i(y) \Big\rangle$$

which might tell us how the wave function at $x$ is correlated with the wave function at
some other point $y$, but $S$ is unsuitable because $\psi_i^*(x)\psi_i(y)$ has a phase that depends on
$V$. Thus, $S$ would vanish when averaged over disorder. Instead, the correct quantity to
study is

$$K(x - y; E) = \Big\langle \sum_i \delta(E - E_i)\, \psi_i^*(x)\, \psi_i(y)\, \psi_i^*(y)\, \psi_i(x) \Big\rangle$$

since $\psi_i^*(x)\psi_i(y)\psi_i^*(y)\psi_i(x)$ is manifestly positive. Note that upon averaging over all possible
$V(x)$ we recover translation invariance so that $K$ does not depend on $x$ and $y$ separately,
but only on the separation $|x - y|$. As $|x - y| \to \infty$, if $K(x - y; E) \sim e^{-|x - y|/\xi(E)}$ decreases
exponentially the wave functions around the energy $E$ are localized over the so-called localization
length $\xi(E)$. On the other hand, if $K(x - y; E)$ decreases as a power law of $|x - y|$,
the wave functions are said to be extended.

Anderson and his collaborators made the surprising discovery that localization properties
depend on $D$, the dimension of space, but not on the detailed form of $P(V)$ (an
example of the notion of universality). For $D = 1$ and 2 all wave functions are localized,
regardless of how weak the impurity potential might be. This is a highly nontrivial statement
since a priori you might think, as eminent physicists did at the time, that whether the
wave functions are localized or not depends on the strength of the potential. In contrast, for
$D = 3$, the wave functions are extended for $E$ in the range $(-E_c, E_c)$. As $E$ approaches the
energy $E_c$ (known as the mobility edge) from above, the localization length $\xi(E)$ diverges
as $\xi(E) \sim 1/(E - E_c)^\mu$ with some critical exponent $\mu$.¹ Anderson received the Nobel Prize
for this work and for other contributions to condensed matter theory.

Physically, localization is due to destructive interference between the quantum waves 
scattering off the random potential. 

When a magnetic field is turned on perpendicular to the plane of a $D = 2$ electron gas
the situation changes dramatically: An extended wave function appears at $E = 0$. For nonzero
$E$, all wave functions are still localized, but with the localization length diverging as
$\xi(E) \sim 1/|E|^\nu$. This accounts for one of the most striking features of the quantum Hall
effect (see chapter VI.2): The Hall conductivity stays constant as the Fermi energy increases
but then suddenly jumps by a discrete amount due to the contribution of the extended state
as the Fermi energy passes through $E = 0$. Understanding this behavior quantitatively
poses a major challenge for condensed matter theorists. Indeed, many consider an analytic
calculation of the critical exponent $\nu$ as one of the “Holy Grails” of condensed matter theory.

¹ This is an example of a quantum phase transition. The entire discussion is at zero temperature. In contrast
to the phase transition discussed in chapter V.3, here we vary $E$ instead of the temperature.


Green’s function formalism 

So much for a lightning glimpse of localization theory. Fascinating though the localization
transition might be, what does quantum field theory have to do with it? This is after all a
field theory text. Before proceeding we need a bit of formalism. Consider the so-called
Green’s function $G(z) = \langle \operatorname{tr}[1/(z - H)] \rangle$ in the complex $z$-plane. Since $\operatorname{tr}[1/(z - H)] = \sum_i 1/(z - E_i)$,
this function consists of a sum of poles at the eigenvalues $E_i$. Upon
averaging, the poles merge into a cut. Using the identity (I.2.13) $\lim_{\varepsilon \to 0} \operatorname{Im}[1/(x + i\varepsilon)] = -\pi\delta(x)$,
we see that

$$\rho(E) = -\frac{1}{\pi} \lim_{\varepsilon \to 0} \operatorname{Im}\, G(E + i\varepsilon) \tag{1}$$

So if we know $G(z)$ we know the density of states.
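
Formula (1) can be tried out on a toy spectrum. The sketch below is only an illustration: it uses a single fixed list of eigenvalues (no disorder average), keeps $\varepsilon$ small but finite so that each pole is smeared into a Lorentzian of width $\varepsilon$, and integrates with a simple midpoint rule. The names `green`, `smoothed_density`, and `count_states` are made up here.

```python
import math


def green(z, eigenvalues):
    """G(z) = sum_i 1/(z - E_i) for one fixed realization of the spectrum."""
    return sum(1.0 / (z - e) for e in eigenvalues)


def smoothed_density(energy, eigenvalues, eps):
    """-(1/pi) Im G(E + i*eps); tends to a sum of delta functions as eps -> 0."""
    return -green(complex(energy, eps), eigenvalues).imag / math.pi


def count_states(e_low, e_high, eigenvalues, eps, steps=20000):
    """Midpoint-rule integral of the smoothed density over [e_low, e_high]."""
    de = (e_high - e_low) / steps
    total = 0.0
    for j in range(steps):
        total += smoothed_density(e_low + (j + 0.5) * de, eigenvalues, eps)
    return total * de


if __name__ == "__main__":
    levels = [-1.0, 0.2, 0.5, 3.0]
    print(count_states(-0.5, 1.5, levels, eps=1e-3))
```

Integrating over an interval that contains two of the four levels gives a number close to 2, with a small deficit from the Lorentzian tails that shrinks as `eps` is reduced.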


The infamous denominator 

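I can now explain how quantum field theory enters into the problem. We start by taking
the logarithm of the identity (A.15)

$$J^\dagger \cdot K^{-1} \cdot J = \log\left( \int D\varphi^\dagger D\varphi\; e^{-\varphi^\dagger \cdot K \cdot \varphi + J^\dagger \cdot \varphi + \varphi^\dagger \cdot J} \right)$$

(where as usual we have dropped an irrelevant term). Differentiating with respect to $J_i^\dagger$
and $J_j$ and then setting $J^\dagger$ and $J$ equal to 0 we obtain an integral representation for the
inverse of a hermitean matrix:

$$(K^{-1})_{ij} = \frac{\int D\varphi^\dagger D\varphi\; e^{-\varphi^\dagger \cdot K \cdot \varphi}\, \varphi_i \varphi_j^\dagger}{\int D\varphi^\dagger D\varphi\; e^{-\varphi^\dagger \cdot K \cdot \varphi}} \tag{2}$$

(Incidentally, you may recognize this as essentially related to the formula (I.7.14) for the
propagator of a scalar field.) Now that we know how to represent $1/(z - H)$ we have to take
its trace, which means setting $i = j$ in (2) and summing. In our problem, $H = -\nabla^2 + V(x)$
and the index $i$ corresponds to the continuous variable $x$ and the summation to an
integration over space. Replacing $K$ by $i(z - H)$ (and taking care of the appropriate delta
function) we obtain

$$\operatorname{tr} \frac{-i}{z - H} = \int d^Dy\; \frac{\int D\varphi^\dagger D\varphi\; e^{i \int d^Dx \{ \partial\varphi^\dagger \partial\varphi + [V(x) - z] \varphi^\dagger \varphi \}}\, \varphi(y) \varphi^\dagger(y)}{\int D\varphi^\dagger D\varphi\; e^{i \int d^Dx \{ \partial\varphi^\dagger \partial\varphi + [V(x) - z] \varphi^\dagger \varphi \}}} \tag{3}$$

This is starting to look like a scalar field theory in $D$-dimensional Euclidean space with the
action $S = \int d^Dx \{ \partial\varphi^\dagger \partial\varphi + [V(x) - z] \varphi^\dagger \varphi \}$. [Note that for (3) to be well defined $z$ has to be
in the lower half-plane.]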

But now we have to average over V(x), that is, integrate over V with the probability 
distribution P(V). We immediately run into the difficulty that confounded theorists for 
a long time. The denominator in (3) stops us cold: If that denominator were not there, 
then the functional integration over $V(x)$ would just be the Gaussian integral you have
learned to do over and over again. Can we somehow lift this infamous denominator into 
the numerator, so to speak? Clever minds have come up with two tricks, known as the 
replica method and the supersymmetric method, respectively. If you can come up with 
another trick, fame and fortune might be yours. 


Replicas 

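The replica trick is based on the well-known identity $(1/x) = \lim_{n \to 0} x^{n-1}$, which allows us to
write the reciprocal of that much disliked denominator as

$$\lim_{n \to 0} \left( \int D\varphi^\dagger D\varphi\; e^{i \int d^Dx \{ \partial\varphi^\dagger \partial\varphi + [V(x) - z] \varphi^\dagger \varphi \}} \right)^{n-1} = \lim_{n \to 0} \int \prod_{a=2}^{n} D\varphi_a^\dagger D\varphi_a\; e^{i \int d^Dx \sum_{a=2}^{n} \{ \partial\varphi_a^\dagger \partial\varphi_a + [V(x) - z] \varphi_a^\dagger \varphi_a \}}$$

Thus (3) becomes

$$\operatorname{tr} \frac{1}{z - H} = \lim_{n \to 0}\, i \int d^Dy \int \left( \prod_{a=1}^{n} D\varphi_a^\dagger D\varphi_a \right) e^{i \int d^Dx \sum_{a=1}^{n} \{ \partial\varphi_a^\dagger \partial\varphi_a + [V(x) - z] \varphi_a^\dagger \varphi_a \}}\, \varphi_1(y) \varphi_1^\dagger(y) \tag{4}$$

Note that the functional integral is now over $n$ complex scalar fields $\varphi_a$. The field $\varphi$ has
been replicated. For positive integers $n$, the integrals in (4) are well defined. We hope that
the limit $n \to 0$ will not blow up in our face.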
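
The elementary identity behind the trick can at least be watched converging for ordinary numbers. The snippet below is a small illustration and says nothing about the delicate functional-integral limit: it evaluates $x^{n-1}$ for a positive number $x$ and a shrinking real $n$, and compares with $1/x$. The function name `replica_power` is invented for this purpose.

```python
def replica_power(x, n):
    """x**(n - 1): n - 1 extra copies of x for integer n, continued to real n."""
    return x ** (n - 1)


if __name__ == "__main__":
    x = 2.5
    for n in (1.0, 1e-1, 1e-3, 1e-6):
        value = replica_power(x, n)
        print(n, value, abs(value - 1.0 / x))
```

The error behaves like $(n \log x)/x$ for small $n$, so it vanishes linearly as $n \to 0$. What is not guaranteed is that the continuation from positive integers to $n \to 0$ is equally harmless inside a functional integral.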

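Averaging over the random potential, we recover translation invariance; thus the integrand
for $\int d^Dy$ does not depend on $y$ and $\int d^Dy$ just produces the volume $\mathcal{V}$ of the
system. Using (A.13) we obtain

$$\left\langle \operatorname{tr} \frac{1}{z - H} \right\rangle = i \mathcal{V} \lim_{n \to 0} \int \left( \prod_{a=1}^{n} D\varphi_a^\dagger D\varphi_a \right) e^{i \int d^Dx\, \mathcal{L}}\, \varphi_1(0) \varphi_1^\dagger(0) \tag{5}$$

where

$$\mathcal{L}(\varphi) = \sum_{a=1}^{n} \left( \partial\varphi_a^\dagger \partial\varphi_a - z \varphi_a^\dagger \varphi_a \right) + \frac{i g^2}{2} \left( \sum_{a=1}^{n} \varphi_a^\dagger \varphi_a \right)^2 \tag{6}$$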

We obtain a field theory (with a peculiar factor of $i$) of $n$ scalar fields with a good old
$\varphi^4$ interaction invariant under $O(n)$ (known as the replica symmetry). Note the wisdom of
replacing $K$ by $i(z - H)$; if we didn’t include the $i$ the functional integral would diverge
at large $\varphi$, as you can easily check. For $z$ in the upper half-plane we would replace $K$ by
$-i(z - H)$. The quantity from which we can extract the desired averaged density of states
is given by the propagator of the scalar field. Incidentally, we can replace $\varphi_1(0)\varphi_1^\dagger(0)$ in (5)
by the more symmetric expression $(1/n) \sum_{b=1}^{n} \varphi_b(0)\varphi_b^\dagger(0)$.

Absorbing $\mathcal{V}$ so that we are calculating the density of states per unit volume, we find

$$G(z) = i \lim_{n \to 0} \int \left( \prod_{a=1}^{n} D\varphi_a^\dagger D\varphi_a \right) e^{i S(\varphi)} \left( \frac{1}{n} \sum_{b=1}^{n} \varphi_b(0) \varphi_b^\dagger(0) \right) \tag{7}$$

For positive integer $n$ the field theory is perfectly well defined, so the delicate step in the
replica approach is in taking the $n \to 0$ limit. There is a fascinating literature on this limit.
(Consult a book devoted to spin glasses.)

Some particle theorists used to speak disparagingly of condensed matter physics as dirt 
physics, and indeed the influence of impurities and disorder on matter is one of the central 
concerns of modern condensed matter physics. But as we see from this example, in many 
respects there is no mathematical difference between averaging over randomness and 
summing over quantum fluctuations. We end up with a $\varphi^4$ field theory of the type that many
particle theorists have devoted considerable effort to studying in the past. Furthermore,
Anderson’s surprising result that for $D = 2$ any amount of disorder, no matter how small,
localizes all states means that we have to understand the field theory defined by (6) in
a highly nontrivial way. The strength of the disorder shows up as the coupling $g^2$, so
no amount of perturbation theory in $g^2$ can help us understand localization. Anderson
localization is an intrinsically nonperturbative effect. 


Grassmannian approach 


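As I mentioned earlier, people have dreamed up not one, but two, tricks in dealing with
the nasty denominator. The second trick is based on what we learned in chapter II.5 on
integration over Grassmann variables: Let $\eta(x)$ and $\bar\eta(x)$ be Grassmann fields, then

$$\int D\eta D\bar\eta\; e^{-\int d^Dx\, \bar\eta K \eta} = C \det K = \tilde C \left( \int D\varphi D\varphi^\dagger\; e^{-\int d^Dx\, \varphi^\dagger K \varphi} \right)^{-1}$$

where $C$ and $\tilde C$ are two uninteresting constants that we can absorb into the definition of
$D\eta D\bar\eta$. With this identity we can write (3) as

$$\operatorname{tr} \frac{1}{z - H} = i \int d^Dy \int D\varphi^\dagger D\varphi\, D\bar\eta D\eta\; e^{i \int d^Dx \{ (\partial\varphi^\dagger \partial\varphi + [V(x) - z] \varphi^\dagger \varphi) + (\partial\bar\eta\, \partial\eta + [V(x) - z] \bar\eta \eta) \}}\, \varphi(y) \varphi^\dagger(y) \tag{8}$$

and then easily average over the disorder to obtain (per unit volume)

$$\left\langle \operatorname{tr} \frac{1}{z - H} \right\rangle = i \int D\varphi^\dagger D\varphi\, D\bar\eta D\eta\; e^{i \int d^Dx\, \mathcal{L}(\bar\eta, \eta, \varphi^\dagger, \varphi)}\, \varphi(0) \varphi^\dagger(0) \tag{9}$$

with

$$\mathcal{L}(\bar\eta, \eta, \varphi^\dagger, \varphi) = \partial\varphi^\dagger \partial\varphi + \partial\bar\eta\, \partial\eta - z(\varphi^\dagger \varphi + \bar\eta \eta) + \frac{i g^2}{2} (\varphi^\dagger \varphi + \bar\eta \eta)^2 \tag{10}$$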
We end up with a field theory with bosonic (commuting) fields $\varphi^\dagger$ and $\varphi$ and fermionic
(anticommuting) fields $\bar\eta$ and $\eta$ interacting with a strength determined by the disorder. The
action $S$ exhibits an obvious symmetry rotating bosonic fields into fermionic fields and
vice versa, and hence this approach is known in the condensed matter physics community
as the supersymmetric method. (It is perhaps worth emphasizing that $\bar\eta$ and $\eta$ are not
spinor fields, which we underline by not writing them as $\bar\psi$ and $\psi$. The supersymmetry
here, perhaps better referred to as Grassmannian symmetry, is quite different from the
supersymmetry in particle physics to be discussed in chapter VIII.4.)

Both the replica and the supersymmetry approaches have their difficulties, and I was 
not kidding when I said that if you manage to invent a new approach without some of these 
difficulties it will be met with considerable excitement by condensed matter physicists. 


Probing localization 

I have shown you how to calculate the averaged density of states p(E). How do we study 
localization? I will let you develop the answer in an exercise. From our earlier discussion 
it should be clear that we have to study an object obtained from (3) by replacing $\varphi(y)\varphi^\dagger(y)$
by $\varphi(x)\varphi^\dagger(y)\varphi(y)\varphi^\dagger(x)$. If we choose to think of the replica field theory in the language of
particle physics as describing the interaction of some scalar meson, then rather pleasingly, 
we see that the density of states is determined by the meson propagator and localization 
is determined by meson-meson scattering. 


Exercises 


VI.7.1 Work out the field theory that will allow you to study Anderson localization. [Hint: Consider the object

$$\left\langle \left( \frac{1}{z - H} \right)(x, y) \left( \frac{1}{w - H} \right)(y, x) \right\rangle$$

for two complex numbers $z$ and $w$. You will have to introduce two sets of replica fields, commonly denoted
by $\varphi^+$ and $\varphi^-$.] {Notation: $[1/(z - H)](x, y)$ denotes the $xy$ element of the matrix or operator $[1/(z - H)]$.}


VI.7.2 As another example from the literature on disorder, consider the following problem. Place $N$ points
randomly in a $D$-dimensional Euclidean space of volume $V$. Denote the locations of the points by
$x_i$ ($i = 1, \ldots, N$). Let

$$f(x) = \int \frac{d^Dk}{(2\pi)^D}\, \frac{e^{ikx}}{k^2 + m^2}$$

Consider the $N$ by $N$ matrix $H_{ij} = f(x_i - x_j)$. Calculate $\rho(E)$, the density of eigenvalues of $H$ as we
average over the ensemble of matrices, in the limit $N \to \infty$, $V \to \infty$, with the density of points $\rho = N/V$
(not to be confused with $\rho(E)$ of course) held fixed. [Hint: Use the replica method and arrive at the field
theory action


S(cp) = 


-i 


d D x 


.a= 1 


+ m 2 \(p a \ 2 ) — pe 


This problem is not entirely trivial; if you need help consult M. Mézard et al., Nucl. Phys. B559: 689, 2000,
cond-mat/9906135.
Renormalization Group Flow as a Natural Concept
in High Energy and Condensed Matter Physics 


Therefore, conclusions based on the renormalization 
group arguments . . . are dangerous and must be viewed 
with due caution. So is it with all conclusions from local 
relativistic field theories. 

—J. Bjorken and S. Drell, 1965 


It is not dangerous 

The renormalization group represents the most important conceptual advance in quantum
field theory over the last three or four decades. The basic ideas were developed simultaneously
in both the high energy and condensed matter physics communities, and in some
areas of research renormalization group flow has become part of the working language.

As you can easily imagine, this is an immensely rich and multifaceted subject, which we 
can discuss from many different points of view, and a full exposition would require a book 
in itself. Unfortunately, there has never been a completely satisfactory and comprehensive 
treatment of the subject. The discussions in some of the older books are downright 
misleading and confused, such as the well-known text from which I learned quantum field 
theory and from which the quote above was taken. In the limited space available here, I 
will attempt to give you a flavor of the subject rather than all the possible technical details. 
I will first approach it from the point of view of high energy physics and then from that 
of condensed matter physics. As ever, the emphasis will be on the conceptual rather than 
the computational. As you will see, in spite of the order of my presentation, it is easier 
to grasp the role of the renormalization group in condensed matter physics than in high 
energy physics. 

I laid the foundation for the renormalization group in chapter III. 1—I do plan ahead! 
Let us go back to our experimentalist friend with whom we were discussing $\lambda\varphi^4$ theory.
We will continue to pretend that our world is described by a simple $\lambda\varphi^4$ theory and that an
approximation to order $\lambda^2$ suffices.
What experimentalists insist on


Our experimentalist friend was not interested in the coupling constant $\lambda$ we wrote down
on a piece of paper, a mere Greek letter to her. She insisted that she would accept only
quantities she and her experimental colleagues can actually measure, even if only in
principle. As a result of our discussion with her we sharpened our understanding of what
a coupling constant is and learned that we should define a physical coupling constant by
[see (III.1.4)]

$$\lambda_P(\mu) = \lambda - 3C\lambda^2 \log\left( \frac{\Lambda^2}{\mu^2} \right) + O(\lambda^3) \tag{1}$$

At her insistence, we learned to express our result for physical amplitudes in terms of
$\lambda_P(\mu)$, and not in terms of the theoretical construct $\lambda$. In particular, we should write the
meson-meson scattering amplitude as

$$\mathcal{M} = -i\lambda_P(\mu) + iC\lambda_P(\mu)^2 \left[ \log\left( \frac{\mu^2}{s} \right) + \log\left( \frac{\mu^2}{t} \right) + \log\left( \frac{\mu^2}{u} \right) \right] + O[\lambda_P(\mu)^3] \tag{2}$$


What is the physical significance of $\lambda_P(\mu)$? To be sure, it measures the strength of
the interaction between mesons as reflected in (2). But why one particular choice of $\mu$?
Clearly, from (2) we see that $\lambda_P(\mu)$ is particularly convenient for studying physics in the
regime in which the kinematic variables $s$, $t$, and $u$ are all of order $\mu^2$. The scattering
amplitude is given by $-i\lambda_P(\mu)$ plus small logarithmic corrections. (Recall from a footnote
in chapter III.3 that the renormalization point $s_0 = t_0 = u_0 = \mu^2$ is adopted purely for
theoretical convenience and cannot be reached in actual experiments. For our conceptual
understanding here this is not a relevant issue.) In short, $\lambda_P(\mu)$ is known as the coupling
constant appropriate for physics at the energy scale $\mu$.

In contrast, if we were so idiotic as to use the coupling constant $\lambda_P(\mu')$ while exploring
physics in the regime with $s$, $t$, and $u$ of order $\mu^2$, with $\mu'$ vastly different from $\mu$, then we
would have a scattering amplitude


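$$\mathcal{M} = -i\lambda_P(\mu') + iC\lambda_P(\mu')^2 \left[ \log\left( \frac{\mu'^2}{s} \right) + \log\left( \frac{\mu'^2}{t} \right) + \log\left( \frac{\mu'^2}{u} \right) \right] + O[\lambda_P(\mu')^3] \tag{3}$$

in which the second term [with $\log(\mu'^2/\mu^2)$ large] can be comparable to or larger than the
first term. The coupling constant $\lambda_P(\mu')$ is not a convenient choice. Thus, for each energy
scale $\mu$ there is an “appropriate” coupling constant $\lambda_P(\mu)$.

Subtracting (2) from (3) we can easily relate $\lambda_P(\mu)$ and $\lambda_P(\mu')$ for $\mu \sim \mu'$:

$$\lambda_P(\mu') = \lambda_P(\mu) + 3C\lambda_P(\mu)^2 \log\left( \frac{\mu'^2}{\mu^2} \right) + O[\lambda_P(\mu)^3] \tag{4}$$

We can express this as a differential “flow equation”

$$\mu \frac{d}{d\mu} \lambda_P(\mu) = 6C\lambda_P(\mu)^2 + O(\lambda_P^3) \tag{5}$$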
As you have already seen repeatedly, quantum field theory is full of historical misnomers.
The description of how $\lambda_P(\mu)$ changes with $\mu$ is known as the renormalization group.
The only appearance of a group concept here is the additive group of transformation
$\mu \to \mu + \delta\mu$.

For the conceptual discussion in chapter III.1 and here, we don’t need to know what
the constant $C$ happens to be. If $C$ happens to be negative, then the coupling $\lambda_P(\mu)$ will
decrease as the energy scale increases, and the opposite will occur if $C$ happens to be
positive. (In fact, the sign is positive, so that as we increase the energy scale, $\lambda_P$ flows away
from the origin.)
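
If you would like to see the flow rather than just read about it, the truncated equation $d\lambda/dt = 6C\lambda^2$, with $t = \log(\mu/\mu_0)$, is easily integrated. The sketch below is only an illustration: the numerical values of $C$ and of the starting coupling are arbitrary choices, not taken from the text, and the names `beta`, `run_coupling`, and `closed_form` are invented. It compares a Runge-Kutta integration with the exact solution $\lambda(t) = \lambda_0/(1 - 6C\lambda_0 t)$ of the truncated equation.

```python
import math


def beta(lam, c):
    """Lowest-order flow d(lambda)/dt = 6*C*lambda**2, with t = log(mu/mu0)."""
    return 6.0 * c * lam ** 2


def run_coupling(lam0, c, t_final, steps=10000):
    """Integrate the flow from t = 0 to t_final with classical fourth-order Runge-Kutta."""
    lam = lam0
    dt = t_final / steps
    for _ in range(steps):
        k1 = beta(lam, c)
        k2 = beta(lam + 0.5 * dt * k1, c)
        k3 = beta(lam + 0.5 * dt * k2, c)
        k4 = beta(lam + dt * k3, c)
        lam += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam


def closed_form(lam0, c, t):
    """Exact solution of the truncated flow: lambda0 / (1 - 6*C*lambda0*t)."""
    return lam0 / (1.0 - 6.0 * c * lam0 * t)


if __name__ == "__main__":
    lam0, c = 0.1, 0.01  # illustrative numbers only
    for t in (0.0, 1.0, 5.0):
        print(t, run_coupling(lam0, c, t) if t > 0 else lam0, closed_form(lam0, c, t))
```

For positive $C$ the coupling creeps upward with $t$, and the closed form makes plain that the truncated equation cannot be trusted once $6C\lambda_0 t$ approaches 1, where the neglected higher-order terms certainly matter.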


Flow of the electromagnetic coupling 

The behavior of $\lambda$ is typical of coupling constants in 4-dimensional quantum field theories.
For example, in quantum electrodynamics, the coupling $e$, or equivalently $\alpha = e^2/4\pi$,
measures the strength of the electromagnetic interaction. The story is exactly as that told
for the $\lambda\varphi^4$ theory: Our experimentalist friend is not interested in the Latin letter $e$, but
wants to know the actual interaction when the relevant momenta squared are of the order
$\mu^2$. Happily, we have already done the computation: We can read off the effective coupling
at momentum transferred squared $q^2 = \mu^2$ from (III.7.14):

$$e_P(\mu)^2 = e^2\, \frac{1}{1 + e^2 \Pi(\mu^2)} \simeq e^2 [1 - e^2 \Pi(\mu^2) + O(e^4)]$$

Take $\mu$ much larger than the electron mass $m$ but much smaller than the cutoff mass $M$.
Then from (III.7.13)

$$\mu \frac{d}{d\mu} e_P(\mu) = -\frac{1}{2} e^3 \mu \frac{d}{d\mu} \Pi(\mu^2) + O(e^5) = +\frac{1}{12\pi^2} e_P^3 + O(e_P^5) \tag{6}$$

We learn that the electromagnetic coupling increases as the energy scale increases.
Electromagnetism becomes stronger as we go to higher energies, or equivalently shorter
distances.
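
In terms of $\alpha = e^2/4\pi$ the flow (6) reads $\mu\, d\alpha/d\mu = 2\alpha^2/(3\pi)$, which integrates to $1/\alpha(\mu) = 1/\alpha(\mu_0) - (2/3\pi) \log(\mu/\mu_0)$. The sketch below simply evaluates this lowest-order formula. It is an illustration under stated assumptions: only the electron loop is kept, both scales are supposed to lie well above $m$ and well below $M$, and the starting value $1/137$ at the reference scale $\mu_0$ is a placeholder input rather than a measured number at that scale. The names `alpha_at` and `charge_at` are invented.

```python
import math


def alpha_at(mu_over_mu0, alpha0):
    """Lowest-order running: 1/alpha(mu) = 1/alpha0 - (2/(3*pi)) * log(mu/mu0)."""
    t = math.log(mu_over_mu0)
    return 1.0 / (1.0 / alpha0 - 2.0 * t / (3.0 * math.pi))


def charge_at(mu_over_mu0, alpha0):
    """e(mu) recovered from alpha = e^2 / (4*pi)."""
    return math.sqrt(4.0 * math.pi * alpha_at(mu_over_mu0, alpha0))


if __name__ == "__main__":
    alpha0 = 1.0 / 137.0  # placeholder input at the reference scale mu0
    for ratio in (1.0, 10.0, 100.0):
        print(ratio, 1.0 / alpha_at(ratio, alpha0))
```

The inverse coupling drops only logarithmically, by about one unit over two decades in $\mu$, which is why the “constant” looked constant for so long.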

Physically, the origin of this phenomenon is closely related to the physics of dielectrics. 
Consider a photon interacting with an electron, which we will call the test electron to 
avoid confusion in what follows. Due to quantum fluctuations, as described way back in 
chapter I.1, spacetime is full of electron-positron pairs, popping in and out of existence.
Near the test electron, the electrons in these virtual pairs are repelled by the test electron 
and thus tend to move away from the test electron while the positrons tend to move toward 
the test electron. Thus, at long distances, the charge of the test electron is shielded to some 
extent by the cloud of positrons, causing a weaker coupling to the photon, while at short 
distances the coupling to the photon becomes stronger. The quantum vacuum is just as 
much a dielectric as a lump of actual material. 

You may have noticed by now that the very name “coupling constant” is a terrible 
misnomer due to the fact that historically much of physics was done at essentially one 
energy scale, namely “almost zero”! In particular, people speak of the fine structure
“constant” $\alpha = 1/137$ and crackpots continue to try to “derive” the number 137 from
numerology or some fancier method. In fact, $\alpha$ is merely the coupling “constant” of the
electromagnetic interaction at very low energies. It is an experimental fact that $\alpha$, more
properly written as $\alpha_P(\mu) = e_P^2(\mu)/4\pi$, varies with the energy scale $\mu$ we are exploring.
But alas, we are probably stuck with the name “coupling constant.”


Renormalization group flow 

In general, in a quantum field theory with a coupling constant $g$, we have the renormalization
group flow equation

$$\mu \frac{dg}{d\mu} = \beta(g) \tag{7}$$

which is sometimes written as $dg/dt = \beta(g)$ upon defining $t = \log(\mu/\mu_0)$. I will now
suppress the subscript $P$ on physical coupling constants. If the theory happens to have
several coupling constants $g_i$, $i = 1, \ldots, N$, then we have

$$\frac{dg_i}{dt} = \beta_i(g_1, \ldots, g_N) \tag{8}$$

We can think of $(g_1, \ldots, g_N)$ as the coordinate of a particle in $N$-dimensional space,
$t$ as time, and $\beta_i(g_1, \ldots, g_N)$ a position dependent velocity field. As we increase $\mu$ or $t$
we would like to study how the particle moves or flows. For notational simplicity, we will
now denote $(g_1, \ldots, g_N)$ collectively as $g$. Clearly, those couplings at which $\beta_i(g^*)$ (for all
$i$) happen to vanish are of particular interest: $g^*$ is known as a fixed point. If the velocity
field around a fixed point $g^*$ is such that the particle moves toward that point (and once
reaching it stays there since its velocity is now zero) the fixed point is known as attractive or
stable. Thus, to study the asymptotic behavior of a quantum field theory at high energies
we “merely” have to find all its attractive fixed points under the renormalization group
flow. In a given theory, we can typically see that some couplings are flowing toward larger
values while others are flowing toward zero.
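
The picture of a particle drifting in a velocity field is easy to play with. In the sketch below the beta function $\beta(g) = g(g^* - g)$ is a toy of my own choosing, not the beta function of any theory discussed here. It has zeros at $g = 0$ and $g = g^*$, and the code follows the flow with a crude Euler step and reads off the stability of each fixed point from the sign of the slope of $\beta$ there. The names `toy_beta`, `flow`, and `is_attractive` are invented.

```python
def toy_beta(g, g_star=1.0):
    """Hypothetical beta function with zeros at g = 0 and g = g_star."""
    return g * (g_star - g)


def flow(g0, t_final, beta=toy_beta, steps=10000):
    """Follow dg/dt = beta(g) from t = 0 to t_final with explicit Euler steps."""
    g = g0
    dt = t_final / steps
    for _ in range(steps):
        g += dt * beta(g)
    return g


def is_attractive(g_fixed, beta=toy_beta, h=1e-6):
    """A zero of beta attracts the flow as t increases if the slope of beta there is negative."""
    slope = (beta(g_fixed + h) - beta(g_fixed - h)) / (2.0 * h)
    return slope < 0.0


if __name__ == "__main__":
    for g0 in (0.1, 0.5, 1.7):
        print(g0, flow(g0, 20.0))
    print(is_attractive(1.0), is_attractive(0.0))
```

In this toy, couplings starting on either side of $g^*$ end up there as $t$ grows, while $g = 0$ repels. Flipping the overall sign of the toy beta function would make $g = 0$ the attractive point instead.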

Unfortunately, this wonderful theoretical picture is difficult to implement in practice
because we essentially have no way of calculating the functions $\beta_i(g)$. In particular, $g^*$ could
well be quite large, associated with what is known as a strong coupling fixed point, and
perturbation theory and Feynman diagrams are of no use in determining the properties
of the theory there. Indeed, we know the fixed point structure of very few theories.

Happily, we know of one particularly simple fixed point, namely $g^* = 0$, at which
perturbation theory is certainly applicable. We can always evaluate (8) perturbatively:
$dg_i/dt = c_i^{jk} g_j g_k + d_i^{jkl} g_j g_k g_l + \cdots$. (In some theories, the series starts with quadratic
terms and in others, with cubic terms. Sometimes there is also a linear term.) Thus, as we
have already seen in a couple of examples, the asymptotic or high energy behavior of the
theory depends on the sign of $\beta_i$ in (8).

Let us now join the film “Physics History” already in progress. In the late 1960s,
experimentalists studying the so-called deep inelastic scattering of electrons on protons
discovered that their data seemed to indicate that after being hit by a highly energetic
electron, one of the quarks inside the proton would propagate freely without interacting
strongly with the other quarks. Normally, of course, the three quarks inside the proton are
strongly bound to each other to form the proton. Eventually, a few theorists realized that
this puzzling state of affairs could be explained if the theory of strong interaction is such
that the coupling flows toward the fixed point g* = 0. If so, then the strong interaction
between quarks would actually weaken at higher and higher energy scales.

All of this is of course now “obvious” with the benefit of hindsight, but dear students, 
remember that at that time field theory was pronounced as possibly unsuitable for young 
minds and the renormalization group was considered “dangerous” even in a field theory
text! 

The theory of strong interaction was unknown. But if we were so bold as to accept the 
dangerous renormalization group ideas then we might even find the theory of the strong 
interaction by searching for asymptotically free theories, which is what theories with an 
attractive fixed point at g* = 0 became known as. 

Asymptotically free theories are clearly wonderful. Their behavior at high energies can 
be studied using perturbative methods. And so in this way the fundamental theory of the 
strong interaction, now known as quantum chromodynamics, about which more later, was 
found. 


Looking at physics on different length scales 

The need for renormalization groups is really transparent in condensed matter physics. 
Instead of generalities, let me focus on a particularly clear example, namely surface 
growth. Indeed, that was why I chose to introduce the Kardar-Parisi-Zhang equation in 
chapter VI.6. We learned that to study surface growth we have to evaluate the functional 
or path integral 

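$$Z(\Lambda) = \int_\Lambda Dh\, e^{-S(h)} \tag{9}$$

with, you will recall,

$$S(h) = \frac{1}{2}\int d^D x\, dt \left[\frac{\partial h}{\partial t} - \nabla^2 h - \frac{g}{2}(\nabla h)^2\right]^2 \tag{10}$$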

This defines a field theory. As with any field theory, and as I indicate, a cutoff $\Lambda$ has to be
introduced. We integrate over only those field configurations h(x, t) that do not contain
Fourier components with k and $\omega$ larger than $\Lambda$. (In principle, since this is a nonrelativistic
theory we should have different cutoffs for k and for $\omega$, but for simplicity of exposition let
us just refer to them together generically as $\Lambda$.) The appearance of the cutoff is completely
physical and necessary. At the very least, on length scales comparable to the size of the 
relevant molecules, the continuum description in terms of the field h(x, t) has long since 
broken down. 

Physically, since the random driving term $\eta(x, t)$ is a white noise, that is, $\eta$ at x and
at x' (and also at different times) are not correlated at all, we expect the surface to look
very uneven on a microscopic scale, as depicted in figure VI.8.1a. But suppose we are not
interested in the detailed microscopic structure, but more in how the surface behaves on
a larger scale. In other words, we are content to put on blurry glasses so that the surface
appears as in figure VI.8.1b. This is a completely natural way to study a physical system,
one that we are totally familiar with from day one in studying physics. We may be interested
in physics over some length scale L and do not care about what happens on length scales
much less than L.

Figure VI.8.1

The renormalization group is the formalism that allows us to relate the physics on different
length scales or, equivalently, physics on different energy scales. In condensed matter
physics, one tends to think of length scales, and in particle physics, of energy scales. The
modern approach to renormalization groups came out of the study of critical phenomena
by Kadanoff, Fisher, Wilson, and others, as mentioned in chapter V.3. Consider, for example,
the Ising model, with the spin at each site either up or down and with a ferromagnetic
interaction between neighboring spins. At high temperatures, the spins point randomly
up and down. As the temperature drops toward the ferromagnetic transition point, islands
of up spins (we say up spins to be definite, we could just as easily talk of down spins) start
to appear. They grow ever larger in size until the critical temperature $T_c$ at which all the
spins in the entire system point up. The characteristic length scale of the physics at any 
particular temperature is given by the typical size of the islands. The physically motivated 
block spin method of Kadanoff et al. treats blocks of up spin as one single effective up 
spin, and similarly blocks of down spins. The notion of a renormalization group is then 
the natural one for describing these effective spins by an effective Hamiltonian appropriate 
to that length scale. 

It is more or less clear how to implement this physical idea of changing length scales in
the functional integral (9). We are supposed to integrate over those $h(k, \omega)$ with k and $\omega$
less than $\Lambda$. Suppose we do only a fraction of what we are supposed to do. Let us integrate
over those $h(k, \omega)$ with k and $\omega$ larger than $\Lambda - \delta\Lambda$ but smaller than $\Lambda$. This is precisely
what we mean when we say that we don’t care about the fluctuations of h(x, t) on length
and time scales less than $(\Lambda - \delta\Lambda)^{-1}$.

Putting on blurry glasses 

For the sake of simplicity, let us go back to our favorite, the $\lambda\varphi^4$ theory, instead of the surface
growth problem. Recall from the preceding chapters the importance of the Euclidean $\lambda\varphi^4$
theory in modern condensed matter theory. So, continue the $\lambda\varphi^4$ theory to Euclidean space
and stare at the integral

$$Z(\Lambda) = \int_\Lambda D\varphi\, e^{-\int d^d x\, \mathcal{L}(\varphi)} \tag{11}$$

The notation $\int_\Lambda$ instructs us to include only those field configurations
$\varphi(x) = \int [d^d k/(2\pi)^d]\, e^{ikx} \varphi(k)$ such that $\varphi(k) = 0$ for $|k| \equiv (\sum_{i=1}^{d} k_i^2)^{1/2}$ larger than $\Lambda$. As explained
in the text this amounts to putting on blurry glasses with resolution $L = 1/\Lambda$: We do not
admit or see fluctuations with length scales less than L.

Evidently, the O(d) invariance, namely the Euclidean equivalent of Lorentz invariance,
will make our lives considerably easier. In contrast, for the surface growth problem we will
need special glasses that blur space and time differently.¹

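We are now ready to make our glasses blurrier by letting $\Lambda \to \Lambda - \delta\Lambda$ (with $\delta\Lambda > 0$).
Write $\varphi = \varphi_s + \varphi_w$ (s for “smooth” and w for “wriggly”), defined such that the Fourier
components $\varphi_s(k)$ and $\varphi_w(k)$ are nonzero only for $|k| < (\Lambda - \delta\Lambda)$ and $(\Lambda - \delta\Lambda) < |k| < \Lambda$,
respectively. (Obviously, the designation “smooth” and “wriggly” is for convenience.)
Plugging into (11) we can write

$$Z(\Lambda) = \int_{\Lambda - \delta\Lambda} D\varphi_s\, e^{-\int d^d x\, \mathcal{L}(\varphi_s)} \int D\varphi_w\, e^{-\int d^d x\, \mathcal{L}_1(\varphi_s, \varphi_w)} \tag{12}$$

where all the terms in $\mathcal{L}_1(\varphi_s, \varphi_w)$ depend on $\varphi_w$. (What we are doing here is somewhat
reminiscent of what we did in chapter IV.3.) Imagine doing the integral over $\varphi_w$. Call the
result

$$e^{-\int d^d x\, \delta\mathcal{L}(\varphi_s)} \equiv \int D\varphi_w\, e^{-\int d^d x\, \mathcal{L}_1(\varphi_s, \varphi_w)}$$

and thus we have

$$Z(\Lambda) = \int_{\Lambda - \delta\Lambda} D\varphi_s\, e^{-\int d^d x\, [\mathcal{L}(\varphi_s) + \delta\mathcal{L}(\varphi_s)]} \tag{13}$$

There, we have done it! We have rewritten the theory in terms of the “smooth” field $\varphi_s$.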

Of course, this is all formal, since in practice the integral over $\varphi_w$ can only be done
perturbatively assuming that the relevant couplings are small. If we could do the integral
over $\varphi_w$ exactly, we might as well just do the integral over $\varphi$ and then we would have no
need for all this renormalization group stuff.

1 In condensed matter physics, the so-called dynamical exponent z measures this difference. More precisely,
in the context of the surface growth problem, the correlator (introduced in chapter VI.6) satisfies the dynamic
scaling form given in (VI.6.3). Naively, the dynamical exponent z should be 2. (For a brief review of all this, see
M. Kardar and A. Zee, Nucl. Phys. B464[FS]: 449, 1996, cond-mat/9507112.)

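For pedagogical purposes, consider more generally $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 + \sum_n \lambda_n \varphi^n + \cdots$ (so
that $\lambda_2$ is the usual $\frac{1}{2}m^2$ and $\lambda_4$ the usual $\lambda$.) Since terms such as $\partial\varphi_s\, \partial\varphi_w$ integrate to zero,
we have

$$\int d^d x\, \mathcal{L}_1(\varphi_s, \varphi_w) = \int d^d x \left(\frac{1}{2}(\partial\varphi_w)^2 + \frac{1}{2}m^2\varphi_w^2 + \cdots\right)$$

with $\varphi_s$ hiding in the $(\cdots)$. This describes a field $\varphi_w$ interacting with both itself and a
background field $\varphi_s(x)$. By symmetry considerations $\delta\mathcal{L}(\varphi_s)$ has the same form as $\mathcal{L}(\varphi_s)$
but with different coefficients. Adding $\delta\mathcal{L}(\varphi_s)$ to $\mathcal{L}(\varphi_s)$ thus shifts² the couplings $\lambda_n$ [and the
coefficient of $\frac{1}{2}(\partial\varphi_s)^2$.] These shifts generate the flow in the space of couplings I described
earlier.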

We could have perfectly well left (13) as our end result. But suppose we want to compare
(13) with (11). Then we would like to change the $\int_{\Lambda - \delta\Lambda}$ in (13) to $\int_\Lambda$. For convenience,
introduce the real number b < 1 by $\Lambda - \delta\Lambda = b\Lambda$. In $\int_{\Lambda - \delta\Lambda}$ we are told to integrate over
fields with $|k| < b\Lambda$. So all we have to do is make a trivial change of variable: Let $k = bk'$
so that $|k'| < \Lambda$. But then correspondingly we have to change $x = x'/b$ so that $e^{ikx} = e^{ik'x'}$.
Plugging in, we obtain

$$\int d^d x\, \mathcal{L}(\varphi_s) = \int d^d x'\, b^{-d} \left[\frac{1}{2} b^2 (\partial'\varphi_s)^2 + \sum_n \lambda_n \varphi_s^n + \cdots\right] \tag{14}$$

where $\partial' = \partial/\partial x' = (1/b)\, \partial/\partial x$. Define $\varphi'$ by $b^{2-d}(\partial'\varphi_s)^2 = (\partial'\varphi')^2$ or in other words
$\varphi' = b^{\frac{1}{2}(2-d)} \varphi_s$. Then (14) becomes

$$\int d^d x' \left[\frac{1}{2}(\partial'\varphi')^2 + \sum_n \lambda_n b^{-d + (n/2)(d-2)} \varphi'^n + \cdots\right]$$

Thus, if we define the coefficient of $\varphi'^n$ as $\lambda'_n$ we have

$$\lambda'_n = b^{(n/2)(d-2) - d}\, \lambda_n \tag{15}$$

an important result in renormalization group theory.


Relevant, irrelevant, and marginal 

Let us absorb what this means. (For the time being, let us ignore $\delta\mathcal{L}(\varphi_s)$ to keep the
discussion simple.) As we put on blurrier glasses, in other words, as we become interested
in physics over longer distance scales, we can once again write $Z(\Lambda)$ as in (11) except that
the couplings $\lambda_n$ have to be replaced by $\lambda'_n$. Since b < 1 we see from (15) that the $\lambda_n$’s with
$(n/2)(d-2) - d > 0$ get smaller and smaller and can eventually be neglected. A dose of
jargon here: The corresponding operators $\varphi^n$ (for historical reasons we revert for an instant
from the functional integral language to the operator language) are called irrelevant. They
are the losers. Conversely, the winners, namely the $\varphi^n$’s for which $(n/2)(d-2) - d < 0$,
are called relevant. Operators for which $(n/2)(d-2) - d = 0$ are called marginal.

2 Terms such as $(\partial\varphi)^4$ can also be generated and that is why I wrote $\mathcal{L}(\varphi)$ with the $(\cdots)$ under which terms such
as these can be swept. You can check later that for most applications these terms are irrelevant in the technical
sense to be defined below.

For example, take n = 2: $m'^2 = b^{-2} m^2$ and the mass term is always relevant in any
dimension. On the other hand, take n = 4, and we see that $\lambda' = b^{d-4}\lambda$ and $\varphi^4$ is relevant
for d < 4, irrelevant for d > 4, and marginal at d = 4. Similarly, $\lambda'_6 = b^{2d-6}\lambda_6$ and $\varphi^6$ is
marginal at d = 3 and becomes irrelevant for d > 3.

We also see that d = 2 is special: All the $\varphi^n$’s are relevant.
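The bookkeeping in (15) is easy to mechanize. The following is a minimal sketch (the function names are illustrative assumptions, not from the text) that computes the exponent $(n/2)(d-2) - d$ and classifies $\varphi^n$ accordingly, ignoring the dynamical contribution $\delta\mathcal{L}$ just as the discussion above does.

```python
from fractions import Fraction


def scaling_exponent(n, d):
    """Exponent p in lambda'_n = b**p * lambda_n from (15): p = (n/2)(d - 2) - d."""
    return Fraction(n, 2) * (d - 2) - d


def classify(n, d):
    """Classify the operator phi^n in d dimensions using only the 'trivial' scaling (15)."""
    p = scaling_exponent(n, d)
    if p > 0:
        return "irrelevant"  # coupling shrinks as b < 1 is applied repeatedly
    if p < 0:
        return "relevant"    # coupling grows
    return "marginal"


if __name__ == "__main__":
    for d in (2, 3, 4, 5):
        print(d, {n: classify(n, d) for n in (2, 4, 6, 8)})
```

Running the loop reproduces the statements above: the mass term is relevant in every dimension, $\varphi^4$ is marginal at d = 4, $\varphi^6$ is marginal at d = 3, and at d = 2 every $\varphi^n$ is relevant.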

Now all this may ring a bell if you did the exercises religiously. In exercise III.2.1
you showed that the coupling $\lambda_n$ has mass dimension $[\lambda_n] = (n/2)(2-d) + d$. Thus, the
quantity $(n/2)(d-2) - d$ is just the length dimension of $\lambda_n$. For example, for d = 4,
$\lambda_6$ has mass dimension −2 and thus as explained in chapter III.2 the $\varphi^6$ interaction is
nonrenormalizable, namely that it has nasty behavior at high energy. But condensed matter
physicists are interested in the long distance limit, the opposite limit from the one that
interests particle physicists. Thus, it is the nasty guys like $\varphi^6$ that become irrelevant in the
long distance limit.

One more piece of jargon: Given a scalar field theory, the dimension d at which the most
relevant interaction becomes marginal is known as the critical dimension in condensed
matter physics. For example, the critical dimension for a $\varphi^6$ theory is 3. It is now just
a matter of “high school arithmetic” to translate (15) into differential form. Write $\lambda'_n = \lambda_n + \delta\lambda_n$;
then from $b = 1 - (\delta\Lambda/\Lambda)$ we have $\delta\lambda_n = -[(n/2)(d-2) - d]\, \lambda_n\, (\delta\Lambda/\Lambda)$.

Let us now be extra careful about signs. As I have already remarked, for $(n/2)(d-2) - d > 0$
the coupling $\lambda_n$ (which, for definiteness, we will think of as positive) gets smaller,
as is evident from (15). But since we are decreasing $\Lambda$ to $\Lambda - \delta\Lambda$, a positive $\delta\Lambda$ actually
corresponds to the resolution of our blurry glasses $L = \Lambda^{-1}$ changing to $L + L(\delta\Lambda/\Lambda)$.
Thus we obtain


$$L\frac{d\lambda_n}{dL} = -\left[\frac{n}{2}(d-2) - d\right]\lambda_n \tag{16}$$

so that for $(n/2)(d-2) - d > 0$ a positive $\lambda_n$ would decrease as L increases.³
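For readers who want the “high school arithmetic” spelled out, here is one way to write the step from (15) to (16), using the shorthand $p \equiv (n/2)(d-2) - d$ (a label introduced only for this sketch):

```latex
% Shorthand (assumption of this sketch): p = (n/2)(d-2) - d
\begin{aligned}
\lambda'_n &= b^{p}\,\lambda_n, \qquad b = 1 - \frac{\delta\Lambda}{\Lambda}
  \;\Longrightarrow\;
  \delta\lambda_n = (b^{p} - 1)\,\lambda_n \simeq -p\,\lambda_n\,\frac{\delta\Lambda}{\Lambda}, \\
L &= \Lambda^{-1} \;\Longrightarrow\; \delta L = L\,\frac{\delta\Lambda}{\Lambda}
  \quad\text{(for a decrease of $\Lambda$ by $\delta\Lambda$)}, \\
L\,\frac{d\lambda_n}{dL} &= -p\,\lambda_n
  \;\Longrightarrow\;
  \lambda_n(L) = \lambda_n(L_0)\left(\frac{L}{L_0}\right)^{-p}.
\end{aligned}
```

The integrated form is just (15) again with $b = L_0/L$, which is a useful consistency check on the sign.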

In particular, for n = 4, $L(d\lambda/dL) = (4-d)\lambda$. In most condensed matter physics applications,
$d \le 3$ and so $\lambda$ increases as the length scale of the physics under study increases.
The $\varphi^4$ coupling is relevant as noted above.

The $\delta\mathcal{L}(\varphi_s)$, which we provisionally neglected, contributes an additional term, which we
call dynamical in contrast to the geometrical or “trivial” term displayed, to the right-hand
side of (16). Thus, in general $L(d\lambda_n/dL) = -[(n/2)(d-2) - d]\lambda_n + K(d, n, \cdots, \lambda_j, \cdots)$,
with the dynamical term K depending not only on d and n, but also on all the other
couplings. [For example, in (5) the “trivial” term vanishes since we are in 4-dimensional
spacetime; there is only a dynamical contribution.]


3 Note that what appears on the right-hand side is minus the length dimension of $\lambda_n$, not the length dimension
$(n/2)(d-2) - d$ as one might have guessed naively.

As you can see from this discussion, a more descriptive name for the renormalization
group might be “the trick of doing an integral a little bit at a time.”


Exploiting symmetry 

To determine the renormalization group flow of the coupling g in the surface growth
problem we can repeat the same type of computation we did to determine the flow of the
coupling $\lambda$ and of e in our two previous examples, namely we would calculate, to use the
language of particle physics, the amplitude for h-h scattering to one loop order. But instead,
let us follow the physical picture of Kadanoff et al. In $Z(\Lambda) = \int Dh\, e^{-S(h)}$ we integrate over
only those $h(k, \omega)$ with k and $\omega$ larger than $\Lambda - \delta\Lambda$ but less than $\Lambda$.

I will now show you how to exploit the symmetry of the problem to minimize our labor. 
The important thing is not necessarily to learn about the dynamics of surface growth, 
but to learn the methodology that will serve you well in other situations. I have picked 
a particularly “difficult” nonrelativistic problem whose symmetries are not manifest, so 
that if you master the renormalization group for this problem you will be ready for almost 
anything. 

Imagine having done this partial integration and call the result $\int Dh\, e^{-\tilde{S}(h)}$. At this point
you should work out the symmetries of the problem as indicated in the exercises. Then
you can argue that $\tilde{S}(h)$ must have the form

$$\tilde{S}(h) = \frac{1}{2}\int d^D x\, dt \left[\left(\alpha\frac{\partial}{\partial t} - \beta\nabla^2\right)h - \alpha\frac{g}{2}(\nabla h)^2\right]^2 + \cdots \tag{17}$$

depending on two parameters $\alpha$ and $\beta$. The $(\cdots)$ indicates terms involving higher powers
of h and its derivatives. The simplifying observation is that the same coefficient $\alpha$ multiplies
both $\partial h/\partial t$ and $(g/2)(\nabla h)^2$. Once we know $\alpha$ and $\beta$ then by suitable rescaling we can
bring the action $\tilde{S}(h)$ back into the same form as S(h) and thus find out how g changes.
Therefore, it suffices to look at the $(\partial h/\partial t)^2$ and $(\nabla^2 h)^2$ terms in the action, or equivalently
at the propagator, which is considerably simpler to calculate. As Rudolf Peierls once said⁴
to the young Hans Bethe, “Erst kommt das Denken, dann das Integral.” (Roughly, “First
think, then do the integral.”) We will not do the computation here. Suffice it to note that g
has the high school dimension of $(\text{length})^{\frac{1}{2}(D-2)}$ (see exercise VI.8.5). Thus, according to
the preceding discussion we should have

$$L\frac{dg}{dL} = \frac{1}{2}(2-D)\, g + c_D\, g^3 + \cdots \tag{18}$$

A detailed calculation is needed to determine the coefficient $c_D$, which obviously
depends on the dimension of space D since the Feynman integrals depend on D.
The equation tells us how g, an effective measure of nonlinearity in the physics of
surface growth, changes when we change the length scale L. For the record, $c_D = [S(D)/4(2\pi)^D](2D-3)/D$,
with S(D) the D-dimensional solid angle. The interesting
factor is of course (2D − 3), changing sign between⁵ D = 1 and 2.

4 John Wheeler gave me similar advice when I was a student: “Never calculate without first knowing the
answer.”
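As a rough numerical companion to (18), the sketch below evaluates $c_D$ and looks for a nonzero zero of the truncated right-hand side. It assumes the convention $S(D) = 2\pi^{D/2}/\Gamma(D/2)$ for the solid angle (so S(3) = 4π), which the text does not spell out, and it keeps only the two terms displayed in (18); any fixed point found this way sits at a coupling of order one, so it should be read as suggestive only. The function names are illustrative.

```python
import math


def solid_angle(D):
    """Total solid angle in D dimensions, assuming S(D) = 2 pi^(D/2) / Gamma(D/2)."""
    return 2.0 * math.pi ** (D / 2) / math.gamma(D / 2)


def c_D(D):
    """Coefficient quoted in the text: c_D = [S(D) / (4 (2 pi)^D)] (2D - 3) / D."""
    return solid_angle(D) / (4.0 * (2.0 * math.pi) ** D) * (2 * D - 3) / D


def beta_g(g, D):
    """Right-hand side of (18), truncated after the cubic term."""
    return 0.5 * (2 - D) * g + c_D(D) * g ** 3


def nontrivial_fixed_point(D):
    """Positive g* with beta_g(g*, D) = 0, or None if the truncation has none."""
    ratio = (D - 2) / (2.0 * c_D(D))
    return math.sqrt(ratio) if ratio > 0 else None


if __name__ == "__main__":
    for D in (1, 2, 3):
        print(D, c_D(D), nontrivial_fixed_point(D))
```

Within this truncation, D = 1 and D = 3 each have a nonzero zero of the right-hand side (with opposite signs of the slope there), while for D = 2 the linear term vanishes and g simply grows with L.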


Localization 

As I said earlier, renormalization group flow has literally become part of the language of 
condensed matter and high energy physics. Let me give you another example of the power 
of the renormalization group. Go back to Anderson localization (chapter VI.7), which so 
astonished the community at the time. People were surprised that the localization behavior 
depends so drastically on the dimension of space D, and perhaps even more so, that for 
D = 2 all states are localized no matter how weak the strength of the disorder. Our usual
physical intuition would say that there is a critical strength. As we will now see, both 
features are quite naturally accounted for in the renormalization group language. Already, 
you see in (18) that D enters in an essential way. 

I now offer you a heuristic but beautiful (at least to me) argument given by Abrahams,
Anderson, Licciardello, and Ramakrishnan, who as a result became known to the
condensed matter community as the “Gang of Four.” First, you have to understand the
difference between conductivity $\sigma$ and conductance G in solid state physics lingo.
Conductivity⁶ is defined by $J = \sigma E$, where J measures the number of electrons passing through
a unit area per unit time. Conductance G is the inverse of resistance (the mnemonic: the
two words rhyme). Resistance R is the property of a lump of material and defined in high
school physics by V = IR, where the current I measures the number of electrons passing
by per unit time. To relate $\sigma$ and G, consider a lump of material, taken to be a cube of size
L, with a voltage drop V across it. Then $I = JL^2 = \sigma E L^2 = \sigma(V/L)L^2 = \sigma L V$ and thus⁷
$G(L) = 1/R = I/V = \sigma L$. Next, let us go to two dimensions. Consider a thin sheet of material
of length and width L and thickness $a \ll L$. (We are doing real high school physics,
not talking about some sophisticated field theorist’s idea of two dimensional space!) Again,
apply a voltage drop V over the length L: $I = J(aL) = \sigma E a L = \sigma(V/L)aL = \sigma V a$ and so
$G(L) = 1/R = I/V = \sigma a$. I will let you go on to one dimension: Consider a wire of length
L and width and thickness a. In this way, we obtain $G(L) \propto L^{D-2}$. Incidentally, condensed
matter physicists customarily define a dimensionless conductance $g(L) \equiv \hbar G(L)/e^2$.


5 Incidentally, the theory is exactly solvable for D = 1 (with methods not discussed in this book).

6 Over the years I have asked a number of high energy theorists how is it possible to obtain $J = \sigma E$, which
manifestly violates time reversal invariance, if the microscopic physics of an electron scattering on an impurity
atom perfectly well respects time reversal invariance. Very few knew the answer. The resolution of this apparent
paradox is in the order of limits! Condensed matter theorists calculate a frequency and wave vector dependent
conductivity $\sigma(\omega, k)$ and then take the limit $\omega, k \to 0$ and $k^2/\omega \to 0$. Before the limit is taken, time reversal
invariance holds. The time it takes the particle to find out that it is in a box of size of order 1/k is of order $1/(Dk^2)$
(with D the diffusion constant). The physics is that this time has to be much longer than the observation time
$\sim 1/\omega$.

7 Sam Treiman told me that when he joined the U.S. Army as a radio operator he was taught that there were
three forms of Ohm’s law: V = IR, I = V/R, and R = V/I. In the second equality here we use the fourth form.

We also know the behavior of g(L) when g(L) is small or, in other words, when the
material is an insulator for which we expect $g(L) \sim c\, e^{-L/\xi}$, with $\xi$ some length characteristic
of the material and determined by the microscopic physics. Thus, for g(L) small,
$L(dg/dL) = -(L/\xi)\, g(L) = g(L)[\log g(L) - \log c]$, where the constant log c is negligible
in the regime under consideration.

Putting things together, we obtain 

$$\beta(g) \equiv \frac{L}{g}\frac{dg}{dL} = \begin{cases} (D-2) + \cdots & \text{for large } g \\ \log g + \cdots & \text{for small } g \end{cases} \tag{19}$$

First, a trivial note: in different subjects, people define $\beta(g)$ differently (without affecting
the physics of course). In localization theory, $\beta(g)$ is traditionally defined as d log g/d log L
as indicated here. Given (19) we can now make a “most plausible” plot of $\beta(g)$ as shown
in figure VI.8.2. You see that for D = 2 (and D = 1) the conductance g(L) always flows
toward 0 as we go to long distances (macroscopic measurements on macroscopic materials)
regardless of where we start. In contrast, for D = 3, if $g_0$, the initial value of g, is greater than
a critical $g_c$ then g(L) flows to infinity (presumably cut off by physics we haven’t included)
and the material is a metal, while if $g_0 < g_c$, the material is an insulator. Incidentally,
condensed matter theorists often speak of a critical dimension $D_c$ at which the long
distance behavior of a system changes drastically; in this case $D_c = 2$.
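The qualitative picture can be explored with a toy interpolation between the two limits in (19). The specific formula $\beta(g) = (D-2) - \log(1 + 1/g)$ used below is purely an illustrative assumption chosen to have the right behavior at large and small g; it is not the function plotted in the figure, and the numbers it produces (such as the value of $g_c$) carry no physical weight.

```python
import math


def beta(g, D):
    """Toy interpolation (an assumption): -> D - 2 at large g, ~ log g at small g."""
    return (D - 2) - math.log1p(1.0 / g)


def flow(g0, D, log_L_ratio, steps=10000):
    """Integrate d(log g)/d(log L) = beta(g) by Euler steps in u = log g."""
    u = math.log(g0)
    h = log_L_ratio / steps
    for _ in range(steps):
        u += h * beta(math.exp(u), D)
    return math.exp(u)


if __name__ == "__main__":
    for D, g0 in ((1, 10.0), (2, 10.0), (3, 1.0), (3, 0.3)):
        print(D, g0, flow(g0, D, 5.0))
```

For this toy choice, $\beta$ is negative everywhere for D = 1 and D = 2, so g always decreases with L, while for D = 3 it changes sign at $g_c = 1/(e-1)$ and initial values on either side flow apart.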


Effective description 

In a sense, the renormalization group goes back to a basic notion of physics, that the
effective description can and should change as we move from one length scale to another.
For example, in hydrodynamics we do not have to keep track of the detailed interaction
among water molecules. Similarly, when we apply the renormalization group flow to the
strong interaction, starting at high energies and moving toward low energies, the effective
description goes from a theory of quarks and gluons to a theory of nucleons and mesons.
In this more general picture then, we no longer think of flowing in a space of coupling
constants, but in “the space of Hamiltonians” that some condensed matter physicists like
to talk about.


Exercises 


VI.8.1 Show that the solution of $dg/dt = -bg^3 + \cdots$ is given by

$$\frac{1}{\alpha(t)} = \frac{1}{\alpha(0)} + 8\pi b t + \cdots \tag{20}$$

where we defined $\alpha(t) = g(t)^2/4\pi$.


VI.8.2 In our discussion of the renormalization group, in $\lambda\varphi^4$ theory or in QED, for the sake of simplicity
we assumed that the mass m of the particle is much smaller than $\mu$ and thus set m equal to zero. But
nothing in the renormalization group idea tells us that we can’t flow to a mass scale below m. Indeed, in
particle physics many orders of magnitude separate the top quark mass $m_t$ from the up quark mass $m_u$.
We might want to study how the strong interaction coupling flows from some mass scale far above $m_t$
down to some mass scale $\mu$ below $m_t$ but still large compared to $m_u$. As a crude approximation, people
often set any mass m below $\mu$ equal to zero and any m above $\mu$ to infinity (i.e., not contributing to the
renormalization group flow). In reality, as $\mu$ approaches m from above the particle starts to contribute
less and drops out as $\mu$ becomes much less than m. Taking either the $\lambda\varphi^4$ theory or QED, study this
so-called threshold effect.


VI.8.3 Show that (10) is invariant under the so-called Galilean transformation

$$h(\vec{x}, t) \to h'(\vec{x}, t) = h(\vec{x} + g\vec{u}t, t) + \vec{u}\cdot\vec{x} + \frac{g}{2}u^2 t \tag{21}$$

Show that because of this symmetry only two parameters $\alpha$ and $\beta$ appear in (17).


VI.8.4 In $\tilde{S}(h)$ only derivatives of the field h can appear and not the field itself. (Since the transformation
$h(x, t) \to h(x, t) + c$ with c a constant corresponds to a trivial shift of where we measure the surface
height from, the physics must be invariant under this transformation.) Terms involving only one power
of h cannot appear since they are all total divergences. Thus, $\tilde{S}(h)$ must start with terms quadratic in h.
Verify that the $\tilde{S}(h)$ given in (17) is indeed the most general. A term proportional to $(\nabla h)^2$ is also allowed
by symmetries and is in fact generated. However, such a term can be eliminated by transforming to a
moving coordinate frame $h \to h + ct$.

VI.8.5 Show that g has the high school dimension of $(\text{length})^{\frac{1}{2}(D-2)}$. [Hint: The form of S(h) implies that t has
the dimension of length squared and so h has the dimension $(\text{length})^{\frac{1}{2}(2-D)}$. Comparing the terms $\nabla^2 h$
and $g(\nabla h)^2$ we determine the dimension of g.]

VI.8.6 Calculate the h propagator to one loop order. Extract the coefficients of the $\omega^2$ and $k^4$ terms in a low
frequency and wave number expansion of the inverse propagator and determine $\alpha$ and $\beta$.


VI.8.7 Study the renormalization group flow of g for D = 1, 2, 3.


VII.1


Quantizing Yang-Mills Theory 
and Lattice Gauge Theory 


One reason that Yang-Mills theory was not immediately taken up by physicists is that 
people did not know how to calculate with it. At the very least, we should be able to 
write down the Feynman rules and calculate perturbatively. Feynman himself took up the 
challenge and concluded, after looking at various diagrams, that extra fields with ghostlike 
properties had to be introduced for the theory to be consistent. Nowadays we know how 
to derive this result more systematically. 

The story goes that Feynman wanted to quantize gravity but Gell-Mann suggested to 
him to first quantize Yang-Mills theory as a warm-up exercise. 

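Consider pure Yang-Mills theory; it will be easy to add matter fields later. Follow what
we have learned. Split the Lagrangian $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1$ as usual into two pieces (we also choose
to scale $A \to gA$):

$$\mathcal{L}_0 = -\frac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 \tag{1}$$

and

$$\mathcal{L}_1 = -\frac{1}{2} g (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu) f^{abc} A^{b\mu} A^{c\nu} - \frac{1}{4} g^2 f^{abc} f^{ade} A^b_\mu A^c_\nu A^{d\mu} A^{e\nu} \tag{2}$$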

Then invert the differential operator in the quadratic piece (1) to obtain the propagator. 
This part looks the same as the corresponding procedure for quantum electrodynamics, 
except for the occurrence of the index a. Just as in electrodynamics, the inverse does not 
exist and we have to fix a gauge. 

I built up the elaborate Faddeev-Popov method to quantize quantum electrodynamics
and as I noted, it was a bit of overkill in that context. But here comes the payoff: We can now
turn the crank. Recall from chapter III.4 that the Faddeev-Popov method would give us

$$Z = \int DA\, e^{iS(A)} \Delta(A)\, \delta[f(A)] \tag{3}$$

with $\Delta(A) \equiv \{\int Dg\, \delta[f(A_g)]\}^{-1}$ and $S(A) = \int d^4x\, \mathcal{L}$ the Yang-Mills action. (As in chapter III.4,
$A_g \equiv gAg^{-1} - i(\partial g)g^{-1}$ denotes the gauge transform of A. Here $g \equiv g(x)$ denotes
the group element that defines the gauge transformation at x and is obviously not to be
confused with the coupling constant.)

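Since $\Delta(A)$ appears in (3) multiplied by $\delta[f(A)]$, in the integral over g we expect, for
a reasonable choice of f(A), only infinitesimal g to be relevant. Let us choose $f(A) = \partial A - \sigma$.
Under an infinitesimal transformation, $A^a_\mu \to A^a_\mu - f^{abc}\theta^b A^c_\mu + \partial_\mu\theta^a$ and thus

$$\Delta(A) = \Big\{\int D\theta\, \delta[\partial A^a - \sigma^a - \partial^\mu(f^{abc}\theta^b A^c_\mu - \partial_\mu\theta^a)]\Big\}^{-1} \tag{4}$$

$$\text{“=”}\ \Big\{\int D\theta\, \delta[\partial^\mu(f^{abc}\theta^b A^c_\mu - \partial_\mu\theta^a)]\Big\}^{-1}$$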

where the “effectively equal sign” follows since $\Delta(A)$ is to be multiplied later by $\delta[f(A)]$.
Let us write formally

$$\partial^\mu(f^{abc}\theta^b A^c_\mu - \partial_\mu\theta^a) = \int d^4y\, K^{ab}(x, y)\, \theta^b(y) \tag{5}$$

thus defining the operator $K^{ab}(x, y) = \partial^\mu(f^{abc}A^c_\mu - \partial_\mu\delta^{ab})\, \delta^{(4)}(x - y)$. Note that in contrast
to electromagnetism here K depends on the gauge potential. The elementary result
$\int d\theta\, \delta(K\theta) = 1/K$ for $\theta$ and K real numbers can be generalized to $\int d\theta\, \delta(K\theta) = 1/\det K$
for $\theta$ a real vector and K a nonsingular matrix. Regarding $K^{ab}(x, y)$ as a matrix, we obtain
$\Delta(A) = \det K$, but we know from chapter II.5 how to represent the determinant as a
functional integral over Grassmann variables: Write $\Delta(A) = \int Dc\, Dc^\dagger\, e^{iS_{\text{ghost}}(c^\dagger, c)}$, with

$$S_{\text{ghost}}(c^\dagger, c) = \int d^4x\, d^4y\, c^\dagger_a(x) K^{ab}(x, y) c_b(y)$$

$$= \int d^4x\, [\partial c^\dagger_a(x)\, \partial c_a(x) - \partial^\mu c^\dagger_a(x) f^{abc} A^c_\mu(x) c_b(x)]$$

$$= \int d^4x\, \partial c^\dagger_a(x)\, D c_a(x) \tag{6}$$

and with D the covariant derivative for the adjoint representation, to which the fields $c_a$
and $c^\dagger_a$ belong just like $A^a_\mu$. The fields $c_a$ and $c^\dagger_a$ are known as ghost fields because they
violate the spin-statistics connection: Though scalar, they are treated as anticommuting.
This “violation” is acceptable because they are not associated with physical particles and
are introduced merely to represent $\Delta(A)$ in a convenient form.
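The finite-dimensional identity behind $\Delta(A) = \det K$ can be written out in one line. This is a sketch assuming K is an invertible real $n \times n$ matrix and, as is customary in this kind of argument, not tracking the overall sign of the determinant:

```latex
\int d^n\theta\; \delta^{(n)}(K\theta)
  \;=\; \int \frac{d^n\eta}{|\det K|}\; \delta^{(n)}(\eta)
  \;=\; \frac{1}{|\det K|}
  \qquad (\eta \equiv K\theta),
\qquad\text{so that}\qquad
\Big\{\int D\theta\; \delta[K\theta]\Big\}^{-1} \;\to\; \det K .
```

The functional statement is the formal continuum version of the same change of variables, with the indices (a, x) of $K^{ab}(x, y)$ playing the role of the matrix indices.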

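This takes care of the $\Delta(A)$ factor in (3). As for the $\delta[f(A)]$ factor, we use the same trick
as in chapter III.4 and integrate Z over $\sigma^a(x)$ with a Gaussian weight $e^{-(i/2\xi)\int d^4x\, \sigma^a(x)^2}$
so that $\delta[f(A)]$ gets replaced by $e^{-(i/2\xi)\int d^4x\, (\partial A^a)^2}$.

Putting it all together, we obtain

$$Z = \int DA\, Dc\, Dc^\dagger\, e^{iS(A) - (i/2\xi)\int d^4x\, (\partial A)^2 + iS_{\text{ghost}}(c^\dagger, c)} \tag{7}$$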

with $\xi$ a gauge parameter. Comparing with the corresponding expression for an abelian
gauge theory in chapter III.4, we see that in nonabelian gauge theories we have a ghost
action $S_{\text{ghost}}$ in addition to the Yang-Mills action. Thus, $\mathcal{L}_0$ and $\mathcal{L}_1$ are changed to

$$\mathcal{L}_0 = -\frac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)^2 - \frac{1}{2\xi}(\partial^\mu A^a_\mu)^2 + \partial c^\dagger_a\, \partial c_a \tag{8}$$

and

$$\mathcal{L}_1 = -\frac{1}{2} g (\partial_\mu A^a_\nu - \partial_\nu A^a_\mu) f^{abc} A^{b\mu} A^{c\nu} - \frac{1}{4} g^2 f^{abc} f^{ade} A^b_\mu A^c_\nu A^{d\mu} A^{e\nu} - \partial^\mu c^\dagger_a\, g f^{abc} A^c_\mu c_b \tag{9}$$

Figure VII.1.1


We can now read off the propagators for the gauge boson and for the ghost field immediately
from (8). In particular, we see that except for the group index a the terms quadratic
in the gauge potential are exactly the same as the terms quadratic in the electromagnetic
gauge potential in (III.4.8). Thus, the gauge boson propagator is


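$$\frac{(-i)}{k^2}\left[g_{\nu\lambda} - (1 - \xi)\frac{k_\nu k_\lambda}{k^2}\right]\delta_{ab} \tag{10}$$

Compare with (III.4.9). From the term $\partial c^\dagger_a\, \partial c_a$ in (8) we find the ghost propagator to be
$(i/k^2)\delta_{ab}$.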

From $\mathcal{L}_1$ we see that there is a cubic and a quartic interaction between the gauge
bosons, and an interaction between the gauge boson and the ghost field, as illustrated
in figure VII.1.1. The cubic and the quartic couplings can be easily read off as

$$g f^{abc}[g_{\mu\nu}(k_1 - k_2)_\lambda + g_{\nu\lambda}(k_2 - k_3)_\mu + g_{\lambda\mu}(k_3 - k_1)_\nu] \tag{11}$$

and

$$-ig^2[f^{abe}f^{cde}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\rho}g_{\nu\lambda}) + f^{ade}f^{cbe}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\nu}g_{\rho\lambda}) + f^{ace}f^{bde}(g_{\mu\nu}g_{\lambda\rho} - g_{\mu\rho}g_{\nu\lambda})] \tag{12}$$

respectively. The coupling to the ghost field is

$$g f^{abc} p^\mu \tag{13}$$

Obviously, we can exploit various permutation symmetries in writing these down. For
instance, in (12) the second term is obtained from the first by the interchange $\{c, \lambda\} \leftrightarrow \{d, \rho\}$,
and the third and fourth terms are obtained from the first and second by the
interchange $\{a, \mu\} \leftrightarrow \{c, \lambda\}$.


Unnatural act 

In a highly symmetric theory such as Yang-Mills, perturbating is clearly an unnatural 
act as it involves brutally splitting C into two parts: a part quadratic in the fields and 
the rest. Consider, for example, an exactly soluble single particle quantum mechanics 
problem, such as the Schrödinger equation with $V(x) = 1 - (1/\cosh x)^2$. Imagine writing
$V(x) = \frac{1}{2}x^2 + W(x)$ and treating W(x) as a perturbation on the harmonic oscillator. You
would have a hard time reproducing the exact spectrum, but this is exactly how we brutalize
Yang-Mills theory in the perturbative approach: We took the “holistic entity” $\text{tr}\, F_{\mu\nu}F^{\mu\nu}$ and
split it up into the “harmonic oscillator” piece $\text{tr}(\partial_\mu A_\nu - \partial_\nu A_\mu)^2$ and a “perturbation.”

If Yang-Mills theory ever proves to be exactly soluble, the perturbative approach with its 
mangling of gauge invariance is clearly not the way to do it. 


Lattice gauge theory 

Wilson proposed a way out: Do violence to Lorentz invariance rather than to gauge invariance.
Let us formulate Yang-Mills theory on a hypercubic lattice in 4-dimensional
Euclidean spacetime. As the lattice spacing $a \to 0$ we expect to recover 4-dimensional
rotational invariance and (by a Wick rotation) Lorentz invariance. Wilson’s formulation,
known as lattice gauge theory, is easy to understand, but the notation is a bit awkward, due
to the lack of rotational invariance. Denote the location of the lattice sites by the vector $x_i$.
On each link, say the one going from $x_i$ to one of its nearest neighbors $x_j$, we associate an
N by N simple unitary matrix $U_{ij}$. Consider the square, known as a plaquette, bounded
by the four corners $x_i$, $x_j$, $x_k$, and $x_l$ (with these nearest neighbors to each other.) See
figure VII.1.2. For each plaquette P we associate the quantity $S(P) = \text{Re}\, \text{tr}\, U_{ij}U_{jk}U_{kl}U_{li}$,
constructed to be invariant under the local transformation

$$U_{ij} \to V_i^\dagger U_{ij} V_j \tag{14}$$

The symmetry is local because for each site $x_i$ we can associate an independent $V_i$.
Wilson defined Yang-Mills theory by

$$Z = \int \prod dU\, e^{(1/2f^2)\sum_P S(P)} \tag{15}$$

where the sum is taken over all the plaquettes in the lattice. The coupling strength / 
controls how wildly the unitary matrices fluctuate. For small /, large values of S(P) 
are favored, and so the £/,•,•’s are all approximately equal to the unit matrix (up to an 
irrelevant global transformation.) 
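If you would like to see the invariance of $S(P)$ under (14) numerically, here is a minimal Python sketch. It assumes $N = 2$ and builds random $SU(2)$ matrices from unit quaternions; the helper names (`random_su2`, `plaquette_action`, `gauge_transform`) are mine and purely illustrative.

```python
import math
import random

def random_su2(rng):
    """Random SU(2) matrix built from a unit quaternion (a, b, c, d)."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in q))
    a, b, c, d = (x / norm for x in q)
    return [[complex(a, b), complex(c, d)],
            [complex(-c, d), complex(a, -b)]]

def matmul(m1, m2):
    """Product of two 2 by 2 matrices stored as nested lists."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(m):
    """Hermitean conjugate of a 2 by 2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def plaquette_action(links):
    """S(P) = Re tr U_ij U_jk U_kl U_li for the ordered list of four links."""
    prod = [[1, 0], [0, 1]]
    for u in links:
        prod = matmul(prod, u)
    return (prod[0][0] + prod[1][1]).real

def gauge_transform(links, site_matrices):
    """Apply U_ij -> V_i^dagger U_ij V_j around the plaquette i, j, k, l."""
    n = len(links)
    return [matmul(matmul(dagger(site_matrices[s]), links[s]),
                   site_matrices[(s + 1) % n])
            for s in range(n)]

rng = random.Random(0)
links = [random_su2(rng) for _ in range(4)]   # U_ij, U_jk, U_kl, U_li
sites = [random_su2(rng) for _ in range(4)]   # V_i, V_j, V_k, V_l
before = plaquette_action(links)
after = plaquette_action(gauge_transform(links, sites))
print(before, after)
```

The two printed numbers should agree to rounding error, since the $V$'s cancel pairwise at each corner and the last one cancels by cyclicity of the trace.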
Figure VII.1.2


Without doing any arithmetic, we can argue by symmetry that in the continuum limit
$a \to 0$, Yang-Mills theory as we know it must emerge: The action is manifestly invariant
under local $SU(N)$ transformation. To actually see this, define a field $A_\mu(x)$ with $\mu =
1, 2, 3, 4$, permeating the 4-dimensional Euclidean space the lattice lives in, by

$$U_{ij} = V_i^\dagger e^{iaA_\mu(x)} V_j \qquad (16)$$

where $x = \frac{1}{2}(x_i + x_j)$ (namely the midpoint of the link $U_{ij}$ lives on) and $\mu$ is the direction
connecting $x_i$ to $x_j$ (namely $\hat\mu = (x_j - x_i)/a$ is the unit vector in the $\mu$ direction). The $V$’s
just reflect the gauge freedom in (14) and obviously do not enter into the plaquette action
$S(P)$ by construction. I will let you show in an exercise that

$$\mathrm{tr}\, U_{ij}U_{jk}U_{kl}U_{li} = \mathrm{tr}\, e^{ia^2F_{\mu\nu} + O(a^3)} \qquad (17)$$

with $F_{\mu\nu}$ the Yang-Mills field strength evaluated at the center of the plaquette. Indeed, we
could have discovered the Yang-Mills field strength in this way. I hope that you start to
see the deep geometric significance of $F_{\mu\nu}$. Continuing the exercise you will find that the
action on each plaquette comes out to be

$$S(P) = \mathrm{Re}\,\mathrm{tr}\, e^{ia^2F_{\mu\nu} + O(a^3)} = \mathrm{Re}\,\mathrm{tr}\left[1 + ia^2F_{\mu\nu} - \tfrac{1}{2}a^4F_{\mu\nu}F_{\mu\nu} + O(a^5)\right] = \mathrm{tr}\,1 - \tfrac{1}{2}a^4\,\mathrm{tr}\,F_{\mu\nu}F_{\mu\nu} + \cdots \qquad (18)$$

and so up to an irrelevant additive constant we recover in (15) the Yang-Mills action in the
continuum limit. Again, it is worth emphasizing that without going through any arithmetic
we could have fixed the $a^4$ term in (18) (up to an overall constant) by dimensional analysis
and gauge invariance.$^1$


1 The sign can be easily checked against the abelian case. 
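The abelian check mentioned in the footnote can be sketched in a few lines of Python. The setup is my own illustrative choice, not the text's: a $U(1)$ field with $A_1 = 0$ and $A_2(x) = F x_1$, so that $F_{12} = F$ is constant, with each link $e^{iaA_\mu}$ evaluated at its midpoint and traversed backward as the complex conjugate.

```python
import cmath

def abelian_plaquette(field_strength, a):
    """Re of the U(1) plaquette for A_1 = 0, A_2(x) = F * x_1 (so F_12 = F)."""
    def a2(x1):
        return field_strength * x1
    # Links around the square (0,0) -> (a,0) -> (a,a) -> (0,a) -> (0,0).
    u_bottom = cmath.exp(1j * a * 0.0)       # along +1, where A_1 = 0
    u_right = cmath.exp(1j * a * a2(a))      # along +2 at x_1 = a
    u_top = cmath.exp(-1j * a * 0.0)         # along -1, where A_1 = 0
    u_left = cmath.exp(-1j * a * a2(0.0))    # along -2 at x_1 = 0
    return (u_bottom * u_right * u_top * u_left).real

F = 0.7
for a in (0.4, 0.2, 0.1):
    exact = abelian_plaquette(F, a)
    approx = 1.0 - 0.5 * a**4 * F**2         # tr 1 - (1/2) a^4 F^2 with tr 1 = 1
    print(a, exact, approx, exact - approx)
```

The plaquette is exactly $\cos(a^2F)$ in this configuration, so the difference from $1 - \frac{1}{2}a^4F^2$ shrinks rapidly as $a \to 0$, confirming the sign of the $a^4$ term in (18).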
The Wilson formulation is beautiful in that none of the hand-wringing over gauge
fixing, Faddeev-Popov determinant, ghost fields, and so forth is necessary for (15) to make
sense. Recalling chapter V.3 you see that (15) defines a statistical mechanics problem
like any other. Instead of integrating over some spin variables say, we integrate over the
group $SU(N)$ for each link. Most importantly, the lattice gauge formulation opens up
the possibility of computing the properties of a highly nontrivial quantum field theory
numerically. Lattice gauge theory is a thriving area of research. For a challenge, try to
incorporate fermions into lattice gauge theory: This is a difficult and ongoing problem
because fermions and spinor fields are naturally associated with $SO(4)$, which does not
sit well on a lattice.


Wilson loop 

Field theorists usually deal with local observables, that is, observables defined at a
spacetime point $x$, such as $J_\mu(x)$ or $\mathrm{tr}\, F_{\mu\nu}(x)F^{\mu\nu}(x)$, but of course we can also deal with
nonlocal observables, such as $e^{i\oint_C dx^\mu A_\mu}$ in electromagnetism, where the line integral is
evaluated over a closed curve $C$. The gauge invariant quantity in the exponential is equal
to the electromagnetic flux going through the surface bounded by $C$. (Indeed, recall
chapter IV.4.)

Wilson pointed out that lattice gauge theory contains a natural gauge invariant but
nonlocal observable $W(C) = \mathrm{tr}\, U_{ij}U_{jk} \cdots U_{nm}U_{mi}$, where the set of links connecting $x_i$
to $x_j$ to $x_k$ et cetera and eventually to $x_m$ and back to $x_i$ traces out a loop called $C$. Referring
to (16) we see that $W(C)$, known as the Wilson loop, is the trace of a product of many
factors of $e^{iaA_\mu}$. Thus, in the continuum limit $a \to 0$, we have evidently

$$W(C) = \mathrm{tr}\, P e^{i\oint_C dx^\mu A_\mu} \qquad (19)$$

with $C$ now an arbitrary curve in Euclidean spacetime. Here $P$ denotes path ordering,
clearly necessary since the $A_\mu$’s associated with different segments of $C$, being matrices,
do not commute with each other. [Indeed, $P$ is defined by the lattice definition of $W(C)$.]
To understand the physical meaning of the Wilson loop, recall chapters I.4 and I.5. To
obtain the potential energy $E$ between two oppositely charged lumps we have to compute

$$\lim_{T\to\infty} \frac{1}{Z}\int DA\, e^{iS_{\text{Maxwell}}(A) + i\int d^4x\, A_\mu J^\mu} = e^{-iET}$$

For two lumps held at a distance $R$ apart we plug in

$$J^\mu(x) = \eta^{\mu 0}\left\{\delta^{(3)}(\vec x) - \delta^{(3)}[\vec x - (R, 0, 0)]\right\}$$

and see that we are actually computing the expectation value
$\langle e^{i(\int_{C_1} dx^\mu A_\mu - \int_{C_2} dx^\mu A_\mu)}\rangle$ in
a fluctuating electromagnetic field, where $C_1$ and $C_2$ denote two straight line segments
at $\vec x = (0, 0, 0)$ and $\vec x = (R, 0, 0)$, respectively. It is convenient to imagine bringing the
two lumps together in the far future (and similarly in the far past). Then we deal instead
with the manifestly gauge invariant quantity $\langle e^{i\oint_C dx^\mu A_\mu}\rangle$, where $C$ is the rectangle shown
in figure VII.1.3. Note that for $T$ large $\log\langle e^{i\oint_C dx^\mu A_\mu}\rangle \sim -iE(R)T$, which is essentially
proportional to the perimeter length of the rectangle $C$.

Figure VII.1.3

As we will discuss in chapter VII.3 and as you have undoubtedly heard, the currently
accepted theory of the strong interaction involves quarks coupled to a nonabelian
Yang-Mills gauge potential $A_\mu$. Thus, to determine the potential energy $E(R)$ between a quark
and an antiquark held fixed at a distance $R$ from each other we “merely” have to compute
the expectation value of the Wilson loop

$$\langle W(C)\rangle = \frac{1}{Z}\int \prod dU\, e^{(1/2f^2)\sum_P S(P)}\, W(C) \qquad (20)$$

In lattice gauge theory we could compute $\log\langle W(C)\rangle$ for $C$ the large rectangle in
figure VII.1.3 numerically, and extract $E(R)$. (We lost the $i$ because we are living in Euclidean
spacetime for the purpose of this discussion.)


Quark confinement 

You have also undoubtedly heard that since free quarks have not been observed, quarks are
generally believed to be permanently confined. In particular, it is believed that the potential
energy between a quark and an antiquark grows linearly with separation $E(R) \sim \sigma R$.
One imagines a string tying the quark to the antiquark with a string tension $\sigma$. If this
conjecture is correct, then $\log\langle W(C)\rangle \sim \sigma RT$ should go as the area $RT$ enclosed by $C$.
Wilson calls this behavior the area law, in contrast to the perimeter law characteristic of
familiar theories such as electromagnetism. To prove the area law in Yang-Mills theory is
one of the outstanding challenges of theoretical physics.
Exercises


VII.1.1 The gauge choice in the text preserves Lorentz invariance. It is often useful to choose a gauge that breaks
Lorentz invariance, for example, $f(A) = n^\mu A_\mu(x)$ with $n$ some fixed 4-vector. This class of gauge choices,
known as the axial gauge, contains various popular gauges, each of which corresponds to a particular
choice of $n$. For instance, in light-cone gauge, $n = (1, 0, 0, 1)$, in space-cone gauge, $n = (0, 1, i, 0)$. Show
that for any given $A(x)$ we can find a gauge transformation so that $n \cdot A'(x) = 0$.

VII.1.2 Derive (17) and relate $f$ to the coupling $g$ in the continuum formulation of Yang-Mills theory. [Hint: Use
the Baker-Campbell-Hausdorff formula

$$e^A e^B = e^{A + B + \frac{1}{2}[A,B] + \frac{1}{12}([A,[A,B]] + [B,[B,A]]) + \cdots}.]$$

VII.1.3 Consider a lattice gauge theory in $(D + 1)$-dimensional space with the lattice spacing $a$ in $D$-dimensional
space and $b$ in the extra dimension. Obtain the continuum $D$-dimensional field theory in the limit $a \to 0$
with $b$ kept fixed.

VII.1.4 Study in (2) the alternative limit $b \to 0$ with $a$ kept fixed so that you obtain a theory on a spatial lattice
but with continuous time.

VII.1.5 Show that for lattice gauge theory the Wilson area law holds in the limit of strong coupling. [Hint: Expand
(20) in powers of $f^{-2}$.]



Electroweak Unification 


VII.2


The scourge of massless spin 1 particles

With the benefit of hindsight, we now know that Nature likes Yang-Mills theory. In the
late 1960s and early 1970s, the electromagnetic and weak interactions were unified into
an electroweak interaction, described by a nonabelian gauge theory based on the group
$SU(2) \otimes U(1)$. Somewhat later, in the early 1970s, it was realized that the strong interaction
can be described by a nonabelian gauge theory based on the group $SU(3)$. Nature literally
consists of a web of interacting Yang-Mills fields.

But when the theory was first proposed in 1954, it seemed to be totally inconsistent with 
observations as they were interpreted at that time. As Yang and Mills themselves pointed 
out in their paper, the theory contains massless spin 1 particles, which were certainly not 
known experimentally. Thus, except for interest on the part of a few theorists (Schwinger, 
Glashow, Bludman, and others) who found the mathematical structure elegantly attractive 
and felt that nonabelian gauge theory must somehow be relevant for the weak interaction, 
the theory gradually sank into oblivion and was not part of the standard graduate
curriculum in particle physics in the 1960s.

Again with the benefit of hindsight, it would seem that there are only two logical 
solutions to the difficulty that experimentalists do not see any massless spin 1 particles 
except for the photon: (1) the Yang-Mills particles somehow acquire mass, or (2) the Yang- 
Mills particles are in fact massless but are somehow not observed. We now know that the 
first possibility was realized in the electroweak interaction and the second in the strong 
interaction. 


Constructing the electroweak theory 

We now discuss electroweak unification. It is perhaps pedagogically clearest to motivate
how we would go about constructing such a theory. As I have said before, this is not a
textbook on particle physics and I necessarily will have to keep the discussion of particle
physics to the bare minimum. I gave you a brief introduction to the structure of the weak
interaction in chapter IV.2. The other salient fact is that the weak interaction violates parity,
as mentioned in chapter II.1. In particular, the left handed electron field $e_L$ and the right
handed electron field $e_R$, which transform into each other under parity, enter into the weak
interaction quite differently.

Let us start with the weak decay of the muon, $\mu^- \to e^- + \bar\nu + \nu'$, with $\nu$ and $\nu'$ the
electron neutrino and muon neutrino, respectively. The relevant term in the Lagrangian
is $\bar\nu'_L\gamma^\mu\mu_L\,\bar e_L\gamma_\mu\nu_L$, with the left handed electron field $e_L$, the electron neutrino field (which
is left handed) $\nu_L$, and so forth. The field $\mu_L$ annihilates a muon, the field $\bar e_L$ creates an
electron, and so on. (Henceforth, we will suppress the word field.) As you probably know,
the elementary constituents of matter form three families, with the first family consisting
of $\nu$, $e$, and the up $u$ and down $d$ quarks, the second of $\nu'$, $\mu$, and the charm $c$ and strange
$s$ quarks, and so on. For our purposes here, we will restrict our attention to the first family.
Thus, we start with $\bar\nu_L\gamma^\mu e_L\,\bar e_L\gamma_\mu\nu_L$.

As I remarked in chapter III.2, a Fermi interaction of this type can be generated by the
exchange of an intermediate vector boson $W^\pm$ with the coupling $W^+_\mu\bar\nu_L\gamma^\mu e_L + W^-_\mu\bar e_L\gamma^\mu\nu_L$.

The idea is then to consider an $SU(2)$ gauge theory with a triplet of gauge bosons denoted
by $W^a_\mu$, with $a = 1, 2, 3$. Put $\nu_L$ and $e_L$ into the doublet representation and the right handed
electron field $e_R$ into a singlet representation, thus

$$\psi_L \equiv \binom{\nu}{e}_L, \qquad e_R \qquad (1)$$

(The notation is such that the upper component of $\psi_L$ is $\nu_L$ and the lower component is
$e_L$.)

The fields $\nu_L$ and $e_L$, but not $e_R$, listen to the gauge bosons $W^a_\mu$. Indeed, according to
(IV.5.21) the Lagrangian contains

$$W^a_\mu\bar\psi_L\tau^a\gamma^\mu\psi_L = \left(W^{1-i2}_\mu\,\bar\psi_L\tfrac{1}{2}\tau^{1+i2}\gamma^\mu\psi_L + \text{h.c.}\right) + \cdots$$

where $W^{1-i2}_\mu \equiv W^1_\mu - iW^2_\mu$ and so forth. We recognize $\tau^{1+i2} \equiv \tau^1 + i\tau^2$ as the raising
operator and the first two terms as $(W^{1-i2}_\mu\bar\nu_L\gamma^\mu e_L + \text{h.c.})$, precisely what we want. By
design, the exchange of $W^\pm_\mu$ generates the desired term $\bar\nu_L\gamma^\mu e_L\,\bar e_L\gamma_\mu\nu_L$.


We need more room 

We would hope that the boson $W^3_\mu$ we were forced to introduce would turn out to be the
photon so that electromagnetism is included. But alas, $W^3_\mu$ couples to the current
$\bar\psi_L\tau^3\gamma^\mu\psi_L = (\bar\nu_L\gamma^\mu\nu_L - \bar e_L\gamma^\mu e_L)$, not the electromagnetic current
$-(\bar e_L\gamma^\mu e_L + \bar e_R\gamma^\mu e_R)$. Oops!

Another problem lurks. To generate a mass term for the electron, we need a doublet
Higgs field $\varphi \equiv \binom{\varphi^+}{\varphi^0}$ in order to construct the $SU(2)$ invariant term $f\bar\psi_L\varphi e_R$ in the
Lagrangian so that when $\varphi$ acquires the vacuum expectation value $\binom{0}{v}$ we will have

$$f\bar\psi_L\varphi e_R \to f(\bar\nu, \bar e)_L\binom{0}{v}e_R = fv\,\bar e_L e_R \qquad (2)$$

But none of the $SU(2)$ transformations leaves $\binom{0}{v}$ invariant: The vacuum expectation value
of $\varphi$ spontaneously breaks the entire $SU(2)$ symmetry, leaving all three $W$ bosons massive.
There is no room for the photon in this failed theory. Aagh!

We need more room. Remarkably, we can avoid both the oops and the aagh by extending
the gauge symmetry to $SU(2) \otimes U(1)$. Denoting the generator of $U(1)$ by $\frac{1}{2}Y$ (called the
hypercharge) and the associated gauge potential by $B_\mu$ [and their counterparts $T^a$ and $W^a_\mu$
for $SU(2)$] we have the covariant derivative $D_\mu = \partial_\mu - igW^a_\mu T^a - ig'B_\mu\frac{Y}{2}$. With four gauge
bosons, we dare to hope that one of them might turn out to be the photon.

The gauge potentials are normalized by the corresponding kinetic energy terms,
$\mathcal{L} = -\frac{1}{4}(B_{\mu\nu})^2 - \frac{1}{4}(W^a_{\mu\nu})^2 + \cdots$ with the abelian $B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu$ and nonabelian field
strength $W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu + \varepsilon^{abc}W^b_\mu W^c_\nu$. The generators $T^a$ are of course normalized
by the commutation relations that define $SU(2)$. In contrast, there is no commutation
relation in the abelian algebra $U(1)$ to fix the normalization of the generator $\frac{1}{2}Y$. Until this
is fixed, the normalization of the $U(1)$ gauge coupling $g'$ is not fixed.

How do we fix the normalization of the generator $\frac{1}{2}Y$? By construction, we want
spontaneous symmetry breaking to leave a linear combination of $T^3$ and $\frac{1}{2}Y$ invariant, to be
identified as the generator the massless photon couples to, namely the charge operator $Q$.
Thus, we write

$$Q = T^3 + \tfrac{1}{2}Y \qquad (3)$$

Once we know $T^3$ and $\frac{1}{2}Y$ of any field, this equation tells us its charge. For example,
$Q(\nu_L) = \frac{1}{2} + \frac{1}{2}Y(\nu_L)$ and $Q(e_L) = -\frac{1}{2} + \frac{1}{2}Y(e_L)$. In particular, we see that the coefficient
of $T^3$ in (3) must be 1 since the charges of $\nu_L$ and $e_L$ differ by 1. The relation (3) fixes the
normalization of $\frac{1}{2}Y$.


Determining the hypercharge 

The next step is to determine the hypercharge of various multiplets in the theory, which in
turn determines how $B_\mu$ couples to these multiplets. Consider $\psi_L$. For $e_L$ to have charge
$-1$, the doublet $\psi_L$ must have $\frac{1}{2}Y = -\frac{1}{2}$. In contrast, the field $e_R$ has $\frac{1}{2}Y = -1$ since $T^3 = 0$
on $e_R$.

Given the hypercharge of $\psi_L$ and $e_R$ we see that the invariance of the term $f\bar\psi_L\varphi e_R$
under $SU(2) \otimes U(1)$ forces the Higgs field $\varphi$ to have $\frac{1}{2}Y = +\frac{1}{2}$. Thus, according to (3)
the upper component of $\varphi$ has electric charge $Q = +\frac{1}{2} + \frac{1}{2} = +1$ and the lower component
$Q = -\frac{1}{2} + \frac{1}{2} = 0$. Thus, we write $\varphi = \binom{\varphi^+}{\varphi^0}$. Recall that $\varphi$ has the vacuum expectation value
$\binom{0}{v}$. The fact that the electrically neutral field $\varphi^0$ acquires a vacuum expectation value but
the charged field $\varphi^+$ does not provides a consistency check.
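The bookkeeping in this section is easy to mechanize. Here is a small Python sketch, using exact fractions, that tabulates $Q = T^3 + \frac{1}{2}Y$ for the fields discussed so far and checks that the hypercharges in the term $f\bar\psi_L\varphi e_R$ add up to zero. The dictionary layout and field labels are my own illustrative choices.

```python
from fractions import Fraction

half = Fraction(1, 2)

# (T3, Y/2) assignments quoted in the text for the lepton and Higgs fields.
FIELDS = {
    "nu_L": (half, -half),
    "e_L": (-half, -half),
    "e_R": (Fraction(0), Fraction(-1)),
    "phi+": (half, half),
    "phi0": (-half, half),
}

def charge(t3, half_y):
    """Electric charge from Q = T3 + Y/2, equation (3)."""
    return t3 + half_y

for name, (t3, half_y) in FIELDS.items():
    print(name, charge(t3, half_y))

# U(1) invariance of f psi_L-bar phi e_R: the hypercharges must sum to zero,
# with the barred field contributing with the opposite sign.
yukawa_hypercharge = -FIELDS["e_L"][1] + FIELDS["phi0"][1] + FIELDS["e_R"][1]
print("net Y/2 of the Yukawa term:", yukawa_hypercharge)
```

Only the neutral component `phi0` comes out with zero charge, which is the consistency check mentioned above: a vacuum expectation value in a charged component would break electromagnetism.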
The theory works itself out


Now that the couplings of the gauge bosons to the various fields, in particular, the Higgs
field, are determined, we can easily work out the mass spectrum of the gauge bosons, as
indeed, let me remind you, you have already done in exercise IV.6.3!

Upon spontaneous symmetry breaking $\varphi \to (1/\sqrt{2})\binom{0}{v}$ (the normalization is
conventional): We simply plug in

$$\mathcal{L} = (D_\mu\varphi)^\dagger(D^\mu\varphi) \to \frac{g^2v^2}{4}W^+_\mu W^{-\mu} + \frac{v^2}{8}(gW^3_\mu - g'B_\mu)^2 \qquad (4)$$

I trust that this is what you got! Thus, the linear combination $gW^3_\mu - g'B_\mu$ becomes massive
while the orthogonal combination remains massless and is identified with the photon. It
is clearly convenient to define the angle $\theta$ by $\tan\theta = g'/g$. Then,

$$Z_\mu = \cos\theta\, W^3_\mu - \sin\theta\, B_\mu \qquad (5)$$

describes a massive gauge boson known as the $Z$ boson, while the electromagnetic
potential is given by $A_\mu = \sin\theta\, W^3_\mu + \cos\theta\, B_\mu$. Combine (4) and (5) and verify that the mass
squared of the $Z$ boson is $M_Z^2 = v^2(g^2 + g'^2)/4$, and thus by elementary trigonometry
obtain the relation

$$M_W = M_Z\cos\theta \qquad (6)$$
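To check the mass spectrum numerically, one can diagonalize the $2 \times 2$ mass matrix of the neutral sector read off from (4). The sketch below is a minimal illustration in plain Python; it assumes $W^\pm_\mu$ is normalized so that the first term of (4) reads $M_W^2 W^+_\mu W^{-\mu}$, and the numerical inputs are arbitrary illustrative values, not taken from the text.

```python
import math

def gauge_boson_masses(g, g_prime, v):
    """Tree-level masses read off from the quadratic terms in (4)."""
    m_w = g * v / 2.0
    # Neutral sector: (v^2/8)(g W3 - g' B)^2 = (1/2) (W3, B) M2 (W3, B)^T
    m2 = [[v**2 * g**2 / 4.0, -v**2 * g * g_prime / 4.0],
          [-v**2 * g * g_prime / 4.0, v**2 * g_prime**2 / 4.0]]
    trace = m2[0][0] + m2[1][1]
    det = m2[0][0] * m2[1][1] - m2[0][1] * m2[1][0]
    disc = math.sqrt(max(trace**2 - 4.0 * det, 0.0))
    m2_photon = (trace - disc) / 2.0     # smaller eigenvalue, should vanish
    m2_z = (trace + disc) / 2.0          # larger eigenvalue
    return m_w, math.sqrt(m2_z), math.sqrt(max(m2_photon, 0.0))

g, g_prime, v = 0.65, 0.35, 246.0        # illustrative numbers only
m_w, m_z, m_photon = gauge_boson_masses(g, g_prime, v)
theta = math.atan2(g_prime, g)           # tan(theta) = g'/g
print(m_w, m_z, m_photon, m_z * math.cos(theta))
```

One eigenvalue vanishes (the photon) because the matrix has zero determinant, and the other reproduces $M_Z^2 = v^2(g^2 + g'^2)/4$, so that the last printed number agrees with $M_W$ as in (6).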

The exchange of the $W$ boson generates the Fermi weak interaction

$$\mathcal{L} = -\frac{g^2}{2M_W^2}\bar\nu_L\gamma^\mu e_L\,\bar e_L\gamma_\mu\nu_L = -\frac{4G}{\sqrt{2}}\bar\nu_L\gamma^\mu e_L\,\bar e_L\gamma_\mu\nu_L$$

where the second equality merely gives the historical definition of the Fermi coupling $G$.
Thus,

$$\frac{G}{\sqrt{2}} = \frac{g^2}{8M_W^2} \qquad (7)$$

Next, we write the relevant piece of the covariant derivative

$$gW^3_\mu T^3 + g'B_\mu\frac{Y}{2} = g(\cos\theta\, Z_\mu + \sin\theta\, A_\mu)T^3 + g'(-\sin\theta\, Z_\mu + \cos\theta\, A_\mu)\frac{Y}{2}$$

in terms of the physically observed $Z$ and $A$. The coefficient of $A_\mu$ works out to be
$g\sin\theta\, T^3 + g'\cos\theta\,(Y/2) = g\sin\theta\,(T^3 + Y/2)$; the fact that the combination $Q = T^3 +
Y/2$ emerges provides a nice check on the formalism. Furthermore, we obtain

$$e = g\sin\theta \qquad (8)$$

Meanwhile, it is convenient to write $g\cos\theta\, T^3 - g'\sin\theta\,(Y/2)$, the coefficient of $Z_\mu$ in
the covariant derivative, in terms of the physically familiar electric charge $Q$ rather than
the theoretical hypercharge $Y$: Thus,

$$g\cos\theta\, T^3 - g'\sin\theta\,(Q - T^3) = \frac{g}{\cos\theta}(T^3 - \sin^2\theta\, Q)$$

In other words, we have determined the coupling of the $Z$ boson to an arbitrary fermion
field in the theory:

$$\mathcal{L} = \frac{g}{\cos\theta}Z_\mu\bar\psi\gamma^\mu(T^3 - \sin^2\theta\, Q)\psi \qquad (9)$$

For example, using (9) we can immediately write the coupling of $Z$ to leptons:

$$\mathcal{L} = \frac{g}{\cos\theta}Z_\mu\left[\tfrac{1}{2}(\bar\nu_L\gamma^\mu\nu_L - \bar e_L\gamma^\mu e_L) + \sin^2\theta\,\bar e\gamma^\mu e\right] \qquad (10)$$


Including quarks 


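How to include the hadrons is now almost self evident. Given that only left handed
fields participate in the weak interaction, we put the quarks of the first generation into
$SU(2) \otimes U(1)$ multiplets as follows:

$$q^\alpha_L \equiv \binom{u^\alpha}{d^\alpha}_L, \qquad u^\alpha_R, \qquad d^\alpha_R \qquad (11)$$

where $\alpha = 1, 2, 3$ denotes the color index, which I will discuss in the next chapter. The
right handed quarks $u^\alpha_R$ and $d^\alpha_R$ are put into singlets so that they do not hear the weak
bosons $W^a$. Recall that the up quark $u$ and the down quark $d$ have electric charges $\frac{2}{3}$ and
$-\frac{1}{3}$ respectively. Referring to (3) we see $\frac{1}{2}Y = \frac{1}{6}$, $\frac{2}{3}$, and $-\frac{1}{3}$ for $q^\alpha_L$, $u^\alpha_R$, and $d^\alpha_R$, respectively.
From (9) we can immediately read off the coupling of the $Z$ boson to the quarks:

$$\mathcal{L} = \frac{g}{\cos\theta}Z_\mu\left[\tfrac{1}{2}(\bar u_L\gamma^\mu u_L - \bar d_L\gamma^\mu d_L) - \sin^2\theta\left(\tfrac{2}{3}\bar u\gamma^\mu u - \tfrac{1}{3}\bar d\gamma^\mu d\right)\right] \qquad (12)$$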
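The content of (3) and (9) for the first family can be tabulated in a few lines. The following Python sketch is illustrative only: the field labels and the value of $\sin^2\theta$ are my assumptions, chosen simply to produce concrete numbers.

```python
from fractions import Fraction

half = Fraction(1, 2)

# (T3, Q) for the first-family fields, as assigned in the text.
FERMIONS = {
    "nu_L": (half, Fraction(0)),
    "e_L": (-half, Fraction(-1)),
    "e_R": (Fraction(0), Fraction(-1)),
    "u_L": (half, Fraction(2, 3)),
    "d_L": (-half, Fraction(-1, 3)),
    "u_R": (Fraction(0), Fraction(2, 3)),
    "d_R": (Fraction(0), Fraction(-1, 3)),
}

def half_hypercharge(t3, q):
    """Y/2 = Q - T3, from equation (3)."""
    return q - t3

def z_coupling(t3, q, sin2_theta):
    """Coefficient T3 - sin^2(theta) Q multiplying g/cos(theta) in (9)."""
    return t3 - sin2_theta * q

sin2_theta = Fraction(23, 100)   # illustrative value, not taken from the text
for name, (t3, q) in FERMIONS.items():
    print(name, half_hypercharge(t3, q), z_coupling(t3, q, sin2_theta))
```

Both members of the quark doublet come out with $\frac{1}{2}Y = \frac{1}{6}$, as they must since hypercharge commutes with $SU(2)$, and the right handed fields couple to $Z$ only through the $\sin^2\theta\, Q$ term.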

Finally, I leave it to you to verify that of the four degrees of freedom contained in $\varphi$
(since $\varphi^+$ and $\varphi^0$ are complex) three are eaten by the $W$ and $Z$ bosons, leaving one physical
degree of freedom $H$ corresponding to the elusive Higgs particle that experimenters are
still searching for as of this writing.


The neutral current 


By virtue of its elegantly economical gauge group structure, this $SU(2) \otimes U(1)$ electroweak
theory of Glashow, Salam, and Weinberg ushered in the last great predictive era of
theoretical particle physics. Writing (10) and (12) as

$$\mathcal{L} = \frac{g}{\cos\theta}Z_\mu\left(J^\mu_{\text{leptons}} + J^\mu_{\text{quarks}}\right)$$

and using (6) we see that $Z$ boson exchange generates a hitherto unknown neutral current
interaction

$$\mathcal{L}_{\text{neutral current}} = -\frac{g^2}{2M_W^2}\left(J_{\text{leptons}} + J_{\text{quarks}}\right)^\mu\left(J_{\text{leptons}} + J_{\text{quarks}}\right)_\mu$$

between leptons and quarks. By studying various processes described by $\mathcal{L}_{\text{neutral current}}$ we
can determine the weak angle $\theta$. Once $\theta$ is determined, we can predict $g$ from (8). Once $g$
is determined, we can predict $M_W$ from (7). Once $M_W$ is determined, we can predict $M_Z$
from (6).
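The chain of predictions just described can be written out as a short Python sketch. The function name and the rough input values for $\sin^2\theta$, the fine structure constant $\alpha = e^2/4\pi$, and the Fermi coupling $G$ (in GeV$^{-2}$) are assumptions for illustration; the outputs are tree-level estimates only and ignore radiative corrections.

```python
import math

def predict_masses(sin2_theta, alpha, fermi_g):
    """Tree-level chain: theta -> g via (8), g -> M_W via (7), M_W -> M_Z via (6)."""
    e = math.sqrt(4.0 * math.pi * alpha)      # alpha = e^2 / (4 pi)
    sin_theta = math.sqrt(sin2_theta)
    cos_theta = math.sqrt(1.0 - sin2_theta)
    g = e / sin_theta                          # (8): e = g sin(theta)
    m_w = math.sqrt(math.sqrt(2.0) * g**2 / (8.0 * fermi_g))   # (7)
    m_z = m_w / cos_theta                      # (6): M_W = M_Z cos(theta)
    return g, m_w, m_z

# Rough input values, assumed here for illustration; G is in GeV^-2.
g, m_w, m_z = predict_masses(sin2_theta=0.23, alpha=1.0 / 137.0, fermi_g=1.166e-5)
print(g, m_w, m_z)
```

With inputs of this size the masses come out in GeV, with $M_Z$ larger than $M_W$ by the factor $1/\cos\theta$.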


Concluding remarks 

As I mentioned, there are three families of leptons and quarks in Nature, consisting
of $(\nu_e, e, u, d)$, $(\nu_\mu, \mu, c, s)$, and $(\nu_\tau, \tau, t, b)$. The appearance of this repetitive family
structure, about which the $SU(2) \otimes U(1)$ theory has nothing to say, represents one of
the great unsolved puzzles of particle physics. The three families, with the appropriate 
rotation angles between them, are simply incorporated into the theory by repeating what 
we wrote above. 

A more logical approach than the one given here would be to start with an $SU(2) \otimes U(1)$
theory with a doublet Higgs field with some hypercharge, and to say, “Behold, upon 
spontaneous symmetry breaking, one linear combination of generators remains unbroken 
with a corresponding massless gauge field.” I think that our quasi-historical approach is 
clearer. 

As I have mentioned on several occasions, Fermi’s theory of the weak interaction is
nonrenormalizable. In 1999, ’t Hooft and Veltman were awarded the Nobel Prize for
showing that the $SU(2) \otimes U(1)$ electroweak theory is renormalizable, thus paving the way
for the triumph of nonabelian gauge theories in describing the strong, electromagnetic,
and weak interactions. I cannot go into the details of their proof here, but I would like
to mention that the key is to start with the nonabelian analog of the unitary gauge (recall
chapter IV.6) and proceed to the $R_\xi$ gauge. At large momenta, the massive gauge boson
propagators go as $\sim (k_\mu k_\nu/k^2)$ in the unitary gauge, but as $\sim (1/k^2)$ in the $R_\xi$ gauge. The
theory is then renormalizable by power counting.


Exercises 


VII.2.1 Unfortunately, the mass of the elusive Higgs particle $H$ depends on the parameters in the double well
potential $V = -\mu^2\varphi^\dagger\varphi + \lambda(\varphi^\dagger\varphi)^2$ responsible for the spontaneous symmetry breaking. Assuming that
$H$ is massive enough to decay into $W^+ + W^-$ and $Z + Z$, determine the rates for $H$ to decay into various
modes.


VII.2.2 Show that it is possible to stay with the $SU(2)$ gauge group and to identify $W^3$ as the photon $A$, but at
the cost of inventing some experimentally unobserved lepton fields. This theory does not describe our
world: For one thing, it is essentially impossible to incorporate the quarks. Show this! [Hint: We have to
put the leptons into a triplet of $SU(2)$ instead of a doublet.]



VII.3


Quantum Chromodynamics 


Quarks 

Quarks come in six flavors, known as up, down, strange, charm, bottom, and top, denoted
by $u$, $d$, $s$, $c$, $b$, and $t$. The proton, for example, is made of two up quarks and a down quark
$\sim (uud)$, while the neutral pion corresponds to $\sim (\bar uu - \bar dd)/\sqrt{2}$. Please consult any text
on particle physics for details.

By the late 1960s the notion of quarks was gaining wide acceptance, but two separate
lines of evidence indicated that a crucial element was missing. In studying how hadrons
are made of quarks, people realized that the wave function of the quarks in a nucleon does
not come out to be antisymmetric under the interchange of any pair of quarks, as required
by the Pauli exclusion principle. At around the same time, it was realized that in the ideal
world we used to derive the Goldberger-Treiman relation and in which the pion is massless
we can calculate the decay rate for the process $\pi^0 \to \gamma + \gamma$, as mentioned in chapter IV.7.
Puzzlingly enough, the calculated rate came out smaller than the observed rate by a factor
of $9 = 3^2$.

Both puzzles could be resolved in one stroke by having quarks carry a hitherto unknown
internal degree of freedom that Gell-Mann called color. For any specified flavor, a quark
comes in one of three colors. Thus, the up quark can be red, blue, or yellow. In a nucleon,
the wave function of the three quarks will then contain a factor referring to color, besides
the factors referring to orbital motion, spin, and so on. We merely have to make the color
part of the wave function antisymmetric; in fact, we simply take it to be $\varepsilon^{\alpha\beta\gamma}$, where $\alpha$, $\beta$,
and $\gamma$ denote the colors carried by the three quarks. With quarks in three colors, we have to
multiply the amplitude for $\pi^0$ decay by a factor of 3, thus neatly resolving the discrepancy
between theory and experiment.
Asymptotic freedom

As I mentioned in chapter VI.6, the essential clue came from studying deep inelastic
scattering of electrons off nucleons. Experimentalists made the intriguing discovery that when
hit hard the quarks in the nucleons act as if they hardly interact with each other, in other
words, as if they are free. On the other hand, since quarks are never seen as isolated
entities, they appear to be tightly bound to each other within the nucleon. As I have explained,
this puzzling and apparently contradictory behavior of the quarks can be understood if the
strong interaction coupling flows to zero in the large momentum (ultraviolet) limit and to
infinity or at least to some large value in the small momentum (infrared) limit. A number
of theorists proposed searching for theories whose couplings would flow to zero in the
ultraviolet limit, now known as asymptotically free theories. Eventually, Gross, Wilczek, and
Politzer discovered that Yang-Mills theory is asymptotically free.

This result dovetails perfectly with the realization that quarks carry color. The nonabelian
gauge transformation would take a quark of one color into a quark of another color. Thus, to
write down the theory of the strong interaction we simply take the result of exercise IV.5.6,

$$\mathcal{L} = -\frac{1}{4g^2}F^a_{\mu\nu}F^{a\mu\nu} + \bar q(i\gamma^\mu D_\mu - m)q \qquad (1)$$

with the covariant derivative $D_\mu = \partial_\mu - iA_\mu$. The gauge group is $SU(3)$ with the quark field
$q$ in the fundamental representation. In other words, the gauge fields $A_\mu = A^a_\mu T^a$, where
$T^a$ ($a = 1, \ldots, 8$) are traceless hermitean 3 by 3 matrices. Explicitly, $(A_\mu q)^\alpha = A^a_\mu(T^a)^\alpha{}_\beta\, q^\beta$,
where $\alpha, \beta = 1, 2, 3$. The theory is known as quantum chromodynamics, or QCD for short,
and the nonabelian gauge bosons are known as gluons. To incorporate flavor, we simply
write $\sum_{j=1}^{f}\bar q_j(i\gamma^\mu D_\mu - m_j)q_j$ for the second term in (1), where the index $j$ goes over the
$f$ flavors. Note that quarks of different flavors have different masses.

Infrared slavery 

The flip side of asymptotic freedom is infrared slavery. We cannot follow the
renormalization group flow all the way down to the low momentum scale characteristic of the quarks
bound inside hadrons since the coupling $g$ becomes ever stronger and our perturbative
calculation of $\beta(g)$ is no longer adequate. Nevertheless, it is plausible although never proven
that $g$ goes to infinity and that the gluons keep the quarks and themselves in permanent
confinement. The Wilson loop introduced in chapter VII.1 provides the order parameter
for confinement.

In elementary physics forces decrease with the separation between interacting objects, 
so permanent confinement is a rather bizarre concept. Are there any other instances of 
permanent confinement? 

Consider a magnetic monopole in a superconductor. We get to combine what we learned 
in chapters IV.4 and V.4 (and even VI.2)! A quantized amount of magnetic flux comes out 
of the monopole, but according to the Meissner effect a superconductor expels magnetic 
flux. Thus, a single magnetic monopole cannot live inside a superconductor. 
Now consider an antimonopole a distance $R$ away (figure VII.3.1). The magnetic flux
coming out of the monopole can go into the antimonopole, forming a tube connecting
the monopole and the antimonopole and obliging the superconductor to give up being a
superconductor in the region of the flux tube. In the language of chapter V.4, it is no longer
energetically favorable for the field or order parameter $\varphi$ to be constant everywhere; instead
it vanishes in the region of the flux tube. The energy cost of this arrangement evidently
grows as $R$ (consistent with Wilson’s area law).

In other words, an experimentalist living inside a superconductor would find that 
it costs more and more energy to pull a monopole and an antimonopole apart. This 
confinement of monopoles inside a superconductor is often taken to be a model of the yet- 
to-be-proven confinement of quarks. Invoking electromagnetic duality we can imagine 
a magnetic superconductor in contrast to the usual electric superconductor. Inside a 
magnetic superconductor, electric charges would be permanently confined. Our universe 
may be likened to a color magnetic superconductor in which quarks (the analog of electric 
charges) are confined. 

On distance scales large compared to the radius of the color flux tube connecting a quark 
to an antiquark, the tube can be thought of as a string. Historically, that was how string 
theory originated. The challenge, boys and girls, is to prove that the ground state or vacuum 
of (1) is a color magnetic superconductor. 


Symmetries of the strong interaction 

Now that we have a theory of the strong interaction, we can understand the origin of the
symmetries of the strong interaction, namely the isospin symmetry of Heisenberg and the
chiral symmetry that when spontaneously broken leads to the appearance of the pion as a
Nambu-Goldstone boson (as discussed in chapters IV.2 and VI.4).

Consider a world with two flavors, which is all that is relevant for a discussion of the pion.
Introduce the notation $u = q_1$, $d = q_2$, and $q = \binom{u}{d}$ so that we can write the Lagrangian as

$$\mathcal{L} = -\frac{1}{4g^2}F^a_{\mu\nu}F^{a\mu\nu} + \bar q(i\gamma^\mu D_\mu - m)q$$

with

$$m = \begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix}$$

where $m_u$ and $m_d$ are the masses of the up and down quarks, respectively. If $m_u = m_d$,
the Lagrangian is invariant under $q \to e^{i\theta\cdot\tau}q$, corresponding to Heisenberg’s isospin
symmetry.

In the limit in which $m_u$ and $m_d$ vanish, the Lagrangian is invariant under $q \to e^{i\varphi\cdot\tau\gamma^5}q$,
known as the chiral $SU(2)$ symmetry, chiral because the right handed quarks $q_R$ and the
left handed quarks $q_L$ transform differently. To the extent that $m_u$ and $m_d$ are both much
smaller than the energy scale of the strong interaction, chiral $SU(2)$ is an approximate
symmetry.

The pion is the Nambu-Goldstone boson associated with the spontaneous breaking
of the chiral $SU(2)$. Indeed, this is an example of dynamical symmetry breaking since
there is no elementary scalar field around to acquire a vacuum expectation value. Instead, the
strong interaction dynamics is supposed to drive the composite scalar fields $\bar uu$ and $\bar dd$ to
“condense into the vacuum” so that $\langle 0|\bar uu|0\rangle = \langle 0|\bar dd|0\rangle$ become nonvanishing, where the
equality between the two vacuum expectation values ensures that Heisenberg’s isospin is
not spontaneously broken, an experimental fact since there are no corresponding
Nambu-Goldstone bosons. In terms of the doublet field $q$, the QCD vacuum is supposed to be such
that $\langle 0|\bar qq|0\rangle \neq 0$ while $\langle 0|\bar q\tau q|0\rangle = 0$.


Renormalization group flow 


The renormalization group flow of the QCD coupling is governed by

$$\frac{dg}{dt} = \beta(g) = -\frac{11}{3}\,T_2(G)\,\frac{g^3}{16\pi^2} \tag{2}$$

with the all-crucial minus sign. Here

$$T_2(G)\,\delta^{ab} = f^{acd} f^{bcd} \tag{3}$$

I will not go through the calculation of $\beta(g)$ here, but having mastered chapters VI.8
and VII.1 you should feel that you can do it if you want to.¹ At the very least, you should
understand the factor $g^3$ and $T_2(G)$ by drawing the relevant Feynman diagrams.

When fermions are included,

$$\frac{dg}{dt} = \beta(g) = \left[-\frac{11}{3}\,T_2(G) + \frac{4}{3}\,T_2(F)\right]\frac{g^3}{16\pi^2} \tag{4}$$

where

$$T_2(F)\,\delta^{ab} = \mathrm{tr}[T^a(F)\,T^b(F)] \tag{5}$$

I do expect you to derive (4) given (2). For SU(N), $T_2(F) = \frac{1}{2}$ for each fermion in the
fundamental representation. Note that asymptotic freedom is lost when there are too many
fermions.
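To see how many fermions are "too many," here is a minimal sketch that evaluates the bracket in (4) for SU(N) with $n_f$ fermions in the fundamental representation. It uses $T_2(F) = \frac{1}{2}$ per fermion as stated above, and it assumes the standard group-theory value $T_2(G) = N$ for SU(N), which is not derived in the text. The function names are illustrative only.

```python
from fractions import Fraction


def one_loop_bracket(t2_adjoint, t2_fermions_total):
    """Bracket of eq. (4): -(11/3) T_2(G) + (4/3) T_2(F)."""
    return -Fraction(11, 3) * t2_adjoint + Fraction(4, 3) * t2_fermions_total


def su_n_bracket(n_colors, n_flavors):
    """Bracket for SU(N) with n_flavors fundamental fermions.

    Assumes T_2(G) = N (standard result, not derived in the text)
    and T_2(F) = 1/2 per fundamental fermion (stated in the text).
    """
    return one_loop_bracket(Fraction(n_colors), Fraction(n_flavors, 2))


def largest_asymptotically_free_flavor_count(n_colors):
    """Largest n_f for which the bracket is still negative."""
    n_f = 0
    while su_n_bracket(n_colors, n_f + 1) < 0:
        n_f += 1
    return n_f


if __name__ == "__main__":
    print(su_n_bracket(3, 6))                           # bracket for six flavors
    print(largest_asymptotically_free_flavor_count(3))  # flavor bound for SU(3)
```

With these assumptions the bracket is $-(11N/3) + (2n_f/3)$, so the sign flips once $n_f$ exceeds $11N/2$.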


1 For a detailed calculation, see, e.g., S. Weinberg, The Quantum Theory of Fields, Vol. 2, sec. 18.7.

You already solved an equation like (4) in exercise VI.8.1. Let us define, in analogy to
quantum electrodynamics, $\alpha_S(\mu) \equiv g(\mu)^2/4\pi$, the strong coupling at the momentum scale
$\mu$. From (4) we obtain²

$$\alpha_S(Q) = \frac{\alpha_S(\mu)}{1 + (1/4\pi)\,(11 - \frac{2}{3}n_f)\,\alpha_S(\mu)\,\log(Q^2/\mu^2)} \tag{6}$$

showing explicitly that $\alpha_S(Q) \to 0$ logarithmically as $Q \to \infty$.
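The running described by (6) can be tabulated directly. The sketch below is only an illustration: the reference scale, the reference coupling, and the number of flavors passed in are placeholder inputs chosen by the caller, not measured values.

```python
import math


def alpha_s(q, mu, alpha_mu, n_f):
    """One-loop running coupling of eq. (6).

    q, mu    : momentum scales in the same (arbitrary) units
    alpha_mu : value of alpha_S at the scale mu (an input, not a prediction)
    n_f      : number of quark flavors included
    """
    b = 11.0 - 2.0 * n_f / 3.0
    return alpha_mu / (1.0 + (b / (4.0 * math.pi)) * alpha_mu * math.log(q**2 / mu**2))


if __name__ == "__main__":
    # Hypothetical reference point: alpha_S = 0.3 at mu = 2 in some unit.
    for q in (2.0, 10.0, 100.0, 1000.0):
        print(q, alpha_s(q, mu=2.0, alpha_mu=0.3, n_f=5))
```

Because (6) says that $1/\alpha_S$ is linear in $\log Q^2$, running from $\mu$ to $Q_1$ and then from $Q_1$ to $Q_2$ gives the same answer as running from $\mu$ to $Q_2$ in one step.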


Electron-positron annihilation 

I have space to show you only one physical application. Experimentalists have measured 
the cross section $\sigma$ of $e^+e^-$ annihilation into hadrons as a function of the total center-of-mass
energy $E$. The amplitude is shown in figure VII.3.2. To calculate the cross section in
terms of the amplitude, we have to go through what some people call “boring kinematics,” 
such as normalizing everything correctly, dividing by the flux of the two beams, and so 
forth (see the appendix to chapter II.6). For the good of your soul, you should certainly go 
through this type of calculation at least once. Believe me, I did it more times than I care 
to remember. But happily, as I will now show you, we can avoid most of this grunge labor. 
First, consider the ratio

$$R(E) \equiv \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)}$$

2 For the accumulated experimental evidence on $\alpha_S(Q)$, see figure 14.3 in F. Wilczek, in: V. Fitch et al., eds.,
Critical Problems in Physics, p. 281.

The kinematic stuff cancels out. In figure VII.3.2 the half of the diagram involving the
electron positron lines and the photon propagator also appears in the Feynman diagram for
$e^+e^- \to \mu^+\mu^-$ (figure VII.3.3) and so cancels out in $R(E)$. The blob in figure VII.3.2, which
hides all the complexity of the strong interaction, is given by $\langle 0|J^\mu(0)|h\rangle$, where $J^\mu$ is the
electromagnetic current and the state $|h\rangle$ can contain any number of hadrons. To obtain
the cross section we have to square the amplitude, include a $\delta$-function for momentum
conservation, and sum over all $|h\rangle$, thus arriving at

$$\sum_h (2\pi)^4\,\delta^4(p_h - p_{e^+} - p_{e^-})\,\langle 0|J^\mu(0)|h\rangle\langle h|J^\nu(0)|0\rangle \tag{7}$$

[with $q \equiv p_{e^+} + p_{e^-} = (E, \vec 0)$]. This quantity can be written as

$$\int d^4x\, e^{iqx}\langle 0|J^\mu(x)J^\nu(0)|0\rangle = \int d^4x\, e^{iqx}\langle 0|[J^\mu(x), J^\nu(0)]|0\rangle = 2\,\mathrm{Im}\left(i\int d^4x\, e^{iqx}\langle 0|T J^\mu(x)J^\nu(0)|0\rangle\right)$$

(The first equality follows from $E > 0$ and the second was explained in chapter III.8.)
To determine this quantity, we would have to calculate an infinite number of Feynman 
diagrams involving lots of quarks and gluons. A typical diagram is shown in figure VII.3.4. 
Completely hopeless! 

This is where asymptotic freedom rides to the rescue! From chapter VI.7 you learned
that for a process at energy $E$ the appropriate coupling strength to use is $g(E)$. But as we
crank up $E$, $g(E)$ gets smaller and smaller. Thus diagrams such as figure VII.3.4 involving
many powers of $g(E)$ all fall away, leaving us with the diagrams with no power of $g(E)$
(fig. VII.3.5a) and two powers of $g(E)$ (figs. VII.3.5b,c,d). No calculation is necessary to
obtain the leading term in $R(E)$, since the diagram in figure VII.3.5a is the same one
that enters into $e^+e^- \to \mu^+\mu^-$. We merely replace the quark propagator by the muon
propagator (quark and muon masses are negligible compared to $E$). At high energy, the
quarks are free and $R(E)$ merely counts the square of the charge $Q_a$ of the various quarks
contributing at that energy. We predict

$$R(E) \xrightarrow[E \to \infty]{} 3\sum_a Q_a^2 \tag{8}$$

The factor of 3 accounts for color.
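The counting in (8) is simple enough to do by hand, but a short sketch makes the bookkeeping explicit. The quark charges below are the standard ones in units of the positron charge. Which flavors count as "contributing at that energy" is left to the caller, since the text does not list thresholds.

```python
from fractions import Fraction

# Standard electric charges in units of the positron charge.
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}


def r_leading(active_flavors, n_colors=3):
    """Leading-order R of eq. (8): n_colors times the sum of squared charges."""
    return n_colors * sum(QUARK_CHARGES[f] ** 2 for f in active_flavors)


if __name__ == "__main__":
    for flavors in ("uds", "udsc", "udscb"):
        print(flavors, r_leading(flavors))
```

Setting `n_colors=1` shows what the prediction would be without the color factor.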
Not only does QCD turn itself off at high energies, it tells us how fast it is turning itself
off. Thus, we can determine how the limit in (8) is approached:

$$R(E) = \left(3\sum_a Q_a^2\right)\left(1 + C\,\frac{2}{(11 - \frac{2}{3}n_f)\,\log(E/\mu)} + \cdots\right) \tag{9}$$

I will leave it to you to calculate C. 


Dreams of exact solubility 

An analytic solution of quantum chromodynamics is something of a “Holy Grail” for
field theorists (a grail that now carries a prize of one million dollars: see
www.ams.org/claymath/). Many field theorists have dreamed that at least “pure” QCD, that is QCD
without quarks, might be exactly soluble. After all, if any 4-dimensional quantum field
theory turns out to be exactly soluble, pure Yang-Mills, with all its fabulous symmetries,
is the most likely possibility. (Perhaps an even more likely candidate for solubility is
supersymmetric Yang-Mills theory. We will touch on supersymmetry in chapter VIII.4.)

Figure VII.3.5

Figure VII.3.6

Let me be specific about what it means to solve QCD. Consider a world with only up
and down quarks with $m_u$ and $m_d$ both set equal to zero, namely a world described by

$$\mathcal{L} = -\frac{1}{4g^2} F^a_{\mu\nu} F^{a\mu\nu} + \bar q\, i\slashed{D}\, q \tag{10}$$

The goal would be to calculate something like the ratio of the mass of the $\rho$ meson $m_\rho$ to
the mass of the proton $m_P$.

To make progress, theoretical physicists typically need to have a small parameter to
expand in, but in trying to solve (10) we are confronted with the immediate difficulty
that there is no such parameter. You might think that $g$ is a parameter, but you would
be mistaken. The renormalization group analysis taught us that $g(\mu)$ is a function of the
energy scale $\mu$ at which it is measured. Thus, there is no particular dimensionless number
we can point to and say that it measures the strength of QCD. Instead, the best we can do
is to point to the value of $\mu$ at which $g(\mu)^2/4\pi$ becomes of order 1. This is the energy,
known as $\Lambda_{QCD}$, at which the strong interaction becomes strong as we come down from
high energy (fig. VII.3.6). But $\Lambda_{QCD}$ merely sets the scale against which other quantities
are to be measured. In other words, if you manage to calculate $m_P$ it better come out
proportional to $\Lambda_{QCD}$ since $\Lambda_{QCD}$ is the only quantity with dimension of mass around.
Similarly for $m_\rho$. Put in precise terms, if you publish a paper with a formula giving $m_\rho/m_P$
in terms of pure numbers such as 2 and $\pi$, the field theory community will hail you as a
conquering hero who has solved QCD exactly.

The apparent trade of a dimensionless coupling $g$ for a dimensional mass scale $\Lambda_{QCD}$
is known as dimensional transmutation, of which we will see another example in the next
chapter.

Exercises

VII.3.1 Calculate C in (9). [Hint: If you need help, consult T. Appelquist and H. Georgi, Phys. Rev. D8: 4000, 1973;
and A. Zee, Phys. Rev. D8: 4038, 1973.]

VII.3.2 Calculate (2).



VII.4


Large N Expansion 


Inventing an expansion parameter 

Quantum chromodynamics is a zero-parameter theory, so it is difficult to give even a first 
approximation. In desperation, field theorists invented a parameter in which to expand 
QCD. Suppose instead of three colors we have $N$ colors. ’t Hooft¹ noticed that as $N \to \infty$
remarkable simplifications occur. The idea is that if we can calculate $m_\rho/m_P$, for example,
in the large $N$ limit the result may be close to the actual value. People sometimes joke that
particle physicists regard 3 as a large number, but actually the correction to the large $N$
limit is typically of order $1/N^2$, about 10% in the real world. Particle physicists would be
more than happy to be able to calculate hadron masses to this degree of accuracy.

As with spontaneous symmetry breaking and a number of other important concepts, the 
large N expansion came out of condensed matter physics but nowadays is used routinely 
in all sorts of contexts. For example, people have tried a large N approach to solve high- 
temperature superconductivity and to fold RNA. 2 


Scaling the QCD coupling 

So, let the color group be U(N) and write 

$$\mathcal{L} = -\frac{N^a}{2g^2}\,\mathrm{tr}\,F_{\mu\nu}F^{\mu\nu} + \bar\psi\,[i(\slashed{\partial} - i\slashed{A}) - m]\,\psi \tag{1}$$

Note that we have replaced $g^2$ by $g^2/N^a$. For finite $N$ this change has no essential
significance. The point is to choose the power $a$ so that interesting simplifications occur in the
limit $N \to \infty$ with $g^2$ held fixed. The cubic and quartic interaction vertices of the gluons
are proportional to $N^a$. On the other hand, since the gluon propagator goes as the inverse
of the quadratic terms in $\mathcal{L}$, it is proportional to $1/N^a$. The coupling of the gluon to the
quark does not depend on $N$.

1 G. ’t Hooft, Under the Spell of the Gauge Principle, p. 378.

2 M. Bon, G. Vernizzi, H. Orland, and A. Zee, “Topological classification of RNA structures,” J. Mol. Biol.
379:900, 2008.

Figure VII.4.1

To fix $a$, let us focus on a specific application, the calculation of $\sigma(e^+e^- \to \text{hadrons})$
discussed in the last chapter. Suppose we want to calculate this cross section at low
energies. Consider the two-gluon exchange diagrams shown in figures VII.4.1a and b.
The two diagrams are of order $g^4$ and we would have to calculate both. Note that 1b is
nonplanar: Since one gluon crosses over the other, the diagram cannot be drawn on the
plane if we insist that lines cannot go through each other.

Now the double-line formalism introduced in chapter IV.5 shines. In this formalism
the diagrams figure VII.4.1a and b are redrawn as in figure VII.4.2a and b. The two gluon
propagators common to both diagrams give a factor $1/N^{2a}$. Now comes the punchline.
We sum over three independent color indices in 2a, thus getting a factor $N^3$. Grab some
crayons and try to color each line in 2a with a different color: you will need three crayons.
In contrast, we sum over only one independent index in 2b, getting only a factor of $N$. In
other words, 2a dominates 2b by a factor $N^2$. In the large $N$ limit we can throw 2b away.

Clearly, the rule is to associate one factor of $N$ with each loop. Thus, the lowest order
diagram, shown in 2c, with $N$ different colors circulating in it, scales as $N$; 2a scales as
$N^3/N^{2a}$. We want 2a and 2c to scale in the same way and thus we choose $a = 1$.






Figure VII.4.2

By drawing more diagrams [e.g., 2d scales as $N(1/N^4)N^4$, with the three factors coming
from the quartic coupling, the propagators, and the sum over colors, respectively], you can
convince yourself that planar diagrams dominate in the large $N$ limit, all scaling as $N$. For
a challenge, try to prove it. Evidently, there is a topological flavor to all this.
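The power counting used above can be written as a one-line rule: each gluon self-interaction vertex contributes $N^a$, each gluon propagator $N^{-a}$, and each closed color loop a factor $N$. The sketch below encodes just that rule. The vertex, propagator, and loop counts for diagrams 2a through 2d are read off from the text, and the function name is illustrative.

```python
def power_of_n(gluon_vertices, gluon_propagators, color_loops, a=1):
    """Exponent of N for a double-line diagram.

    gluon_vertices    : number of cubic or quartic gluon vertices (each ~ N^a)
    gluon_propagators : number of gluon propagators (each ~ N^-a)
    color_loops       : number of independent color index loops (each ~ N)
    """
    return a * gluon_vertices - a * gluon_propagators + color_loops


# (vertices, propagators, loops) as described in the text
DIAGRAMS = {
    "2a": (0, 2, 3),  # planar two-gluon exchange
    "2b": (0, 2, 1),  # nonplanar two-gluon exchange
    "2c": (0, 0, 1),  # lowest order quark loop
    "2d": (1, 4, 4),  # one quartic vertex, four propagators, four loops
}

if __name__ == "__main__":
    for name, counts in DIAGRAMS.items():
        print(name, "scales as N^%d" % power_of_n(*counts))
```

Leaving `a` as a parameter shows why $a = 1$ is singled out: it is the value for which 2a and 2c carry the same power of $N$.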

The reduction to planar diagrams is a vast simplification but there are still an infinite
number of diagrams. At this stage in our mastery of field theory, we still can’t solve large
$N$ QCD. (As I started writing this book, there were tantalizing clues, based on insight and
techniques developed in string theory, that a solution of large $N$ QCD might be within
sight. As I now go through the final revision, that hope has faded.)

The double-line formalism has a natural interpretation. Group theoretically, the matrix
gauge potential $A^i_{\ j}$ transforms just like $q^i \bar q_j$ (but assuredly we are not saying that the gluon
is a quark-antiquark bound state) and the two lines may be thought of as describing a quark
and an antiquark propagating along, with the arrows showing the direction in which color
is flowing.


Random matrix theory 

There is a much simpler theory, structurally similar to large $N$ QCD, that actually can be
solved. I am referring to random matrix theory.

Exaggerating a bit, we can say that quantum mechanics consists of writing down a 
matrix known as the Hamiltonian and then finding its eigenvalues and eigenvectors. In the 
early 1950s, when confronted with the problem of studying the properties of complicated 
atomic nuclei, Eugene Wigner proposed that instead of solving the true Hamiltonian in 
some dubious approximation we might generate large matrices randomly and study the 
distribution of the eigenvalues—a sort of statistical quantum mechanics. Random matrix 
theory has since become a rich and flourishing subject, with an enormous and growing 
literature and applications to numerous areas of theoretical physics and even to pure 
mathematics (such as operator algebra and number theory).³ It has obvious applications
to disordered condensed matter systems and less obvious applications to random surfaces
and hence even to string theory. Here I will content myself with showing how ’t Hooft’s
observation about planar diagrams works in the context of random matrix theory.

Let us generate $N$ by $N$ hermitean matrices $\varphi$ randomly according to the probability

$$P(\varphi) = \frac{1}{Z}\, e^{-N\,\mathrm{tr}\,V(\varphi)} \tag{2}$$

with $V(\varphi)$ a polynomial in $\varphi$. For example, let $V(\varphi) = \frac{1}{2}m^2\varphi^2 + g\varphi^4$. The normalization
$\int d\varphi\, P(\varphi) = 1$ fixes

$$Z = \int d\varphi\, e^{-N\,\mathrm{tr}\,V(\varphi)} \tag{3}$$

The limit $N \to \infty$ is always understood.

3 For a glimpse of the mathematical literature, see D. Voiculescu, ed., Free Probability Theory.

As in chapter VI.7 we are interested in $\rho(E)$, the density of eigenvalues of $\varphi$. To make
sure that you understand what is actually meant, let me describe what we would do were
we to evaluate $\rho(E)$ numerically. For some large integer $N$, we would ask the computer to
generate a hermitean matrix $\varphi$ with the probability $P(\varphi)$ and then to solve the eigenvalue
equation $\varphi v = E v$. After this procedure had been repeated many times, the computer could
plot the distribution of eigenvalues in a histogram that eventually approaches a smooth
curve, called the density of eigenvalues $\rho(E)$.
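The numerical procedure just described can be sketched in a few lines for the Gaussian case $V(\varphi) = \frac{1}{2}m^2\varphi^2$, where the matrix elements are independent Gaussians and can be drawn directly. This is only an illustration under that assumption; a quartic term would need a Monte Carlo sampler instead. The function names, the matrix size, and the number of samples are arbitrary choices.

```python
import numpy as np


def sample_gaussian_hermitian(n, m=1.0, rng=None):
    """Draw an n-by-n hermitean matrix with weight exp(-N tr (m^2/2) phi^2).

    With this weight each matrix element has <|phi_ij|^2> = 1/(n m^2).
    """
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (a + a.conj().T) / 2.0          # hermitean, unit variance per element
    return h / np.sqrt(n * m**2)


def eigenvalue_samples(n, samples, m=1.0, seed=0):
    """Collect the eigenvalues of many independently drawn matrices."""
    rng = np.random.default_rng(seed)
    return np.concatenate(
        [np.linalg.eigvalsh(sample_gaussian_hermitian(n, m, rng)) for _ in range(samples)]
    )


if __name__ == "__main__":
    eigs = eigenvalue_samples(n=100, samples=20)
    density, edges = np.histogram(eigs, bins=40, density=True)
    for lo, hi, d in zip(edges[:-1], edges[1:], density):
        print("%+.2f .. %+.2f  %.3f" % (lo, hi, d))
```

Increasing `n` and `samples` smooths the histogram toward the limiting curve $\rho(E)$.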

We already developed the formalism to compute $\rho(E)$ in (VI.7.1): Compute the real
analytic function $G(z) \equiv \langle (1/N)\,\mathrm{tr}[1/(z - \varphi)]\rangle$ and
$\rho(E) = -(1/\pi)\lim_{\varepsilon\to 0}\mathrm{Im}\,G(E + i\varepsilon)$. The
average $\langle\cdots\rangle$ is taken with the probability $P(\varphi)$:

$$\langle O(\varphi)\rangle = \frac{1}{Z}\int D\varphi\, e^{-N\,\mathrm{tr}\,V(\varphi)}\,O(\varphi)$$

You see that my choice of notation, $\varphi$ for the matrix and $V(\varphi) = \frac{1}{2}m^2\varphi^2 + g\varphi^4$ as an
example, is meant to be provocative. The evaluation of $Z$ is just like the evaluation of a path
integral, but for an action $S(\varphi) = N\,\mathrm{tr}\,V(\varphi)$ that does not involve $\int d^dx$. Random matrix
theory can be thought of as a quantum field theory in (0 + 0)-dimensional spacetime!

Various field theoretic methods, such as Feynman diagrams, can all be applied to 
random matrix theory. But life is sweet in (0 + 0)-dimensional spacetime: There is no 
space, no time, no energy, and no momentum and hence no integral to do in evaluating 
Feynman diagrams. 


The Wigner semicircle law 


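Let us see how this works for the simple case $V(\varphi) = \frac{1}{2}m^2\varphi^2$ (we can always absorb $m$ into
$\varphi$ but we won’t). Instead of $G(z)$, it is slightly easier to calculate

$$G^i_j(z) \equiv \left\langle \left(\frac{1}{z - \varphi}\right)^i_j \right\rangle = \delta^i_j\, G(z)$$

The last equality follows from invariance under unitary transformations:

$$P(\varphi) = P(U^\dagger \varphi U) \tag{4}$$

Expand

$$G^i_j(z) = \sum_{n=0}^{\infty} \frac{1}{z^{2n+1}} \left\langle (\varphi^{2n})^i_j \right\rangle \tag{5}$$

Do the Gaussian integral

$$\frac{1}{Z}\int d\varphi\, e^{-N\,\mathrm{tr}\,\frac{1}{2}m^2\varphi^2}\, \varphi^i_k\, \varphi^l_j = \delta^i_j\, \delta^l_k\, \frac{1}{N m^2} \tag{6}$$

Setting $k = l$ and summing, we find the $n = 1$ term in (5) is equal to $(1/z^3)\,\delta^i_j\,(1/m^2)$.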

Just as in any field theory we can associate a Feynman diagram with each of the terms
in (5). For the $n = 1$ term, we have figure VII.4.3. The matrix character of $\varphi$ lends itself
naturally to ’t Hooft’s double-line formalism and thus we can speak of quark and gluon
propagators with a good deal of ease. The Feynman rules are given in figure VII.4.4. We
recognize $\varphi$ as the gluon field and (6) as the gluon propagator. Indeed, we can formulate
our problem as follows: Given the bare quark propagator $1/z$, compute the true quark
propagator $G(z)$ with all interaction effects taken into account.

Figure VII.4.3

Let us now look at the $n = 2$ term in (5), $(1/z^5)\langle \varphi^i_h \varphi^h_k \varphi^k_l \varphi^l_j \rangle$, which we represent in
figure VII.4.5a. With a bit of thought you can see that the index $i$ can be contracted with
$k$, $l$, or $j$, thus giving rise to figures VII.4.5b, c, d. Summing over color indices, just as in
QCD, we see that the planar diagrams in 5b and 5d dominate the diagram in 5c by a factor
$N^2$. We can take over ’t Hooft’s observation that planar diagrams dominate.

Incidentally, in this example, you see how large $N$ is essential, allowing us to get rid of
nonplanar diagrams. After all, if I ask you to calculate the density of eigenvalues for say
$N = 7$ you would of course protest saying that the general formula for solving a degree-7
polynomial equation is not even known.

Figure VII.4.4

Figure VII.4.5 [panels (a) through (e)]

The simple example in figure VII.4.5 already indicates how all possible diagrams could
be constructed. In 5b the same “unit” is repeated, while in 5d the same “unit” is nested
inside a more basic diagram. A more complicated example is shown in 5e. You can convince
yourself that for $N = \infty$ all diagrams contributing to $G(z)$ can be generated by either
“nesting” existing diagrams inside an overarching gluon propagator or “repeating” an
existing structure over and over again. Translate the preceding sentence into two equations:
“Repeat” (see figure VII.4.6a),

$$G(z) = \frac{1}{z} + \frac{1}{z}\Sigma(z)\frac{1}{z} + \frac{1}{z}\Sigma(z)\frac{1}{z}\Sigma(z)\frac{1}{z} + \cdots = \frac{1}{z - \Sigma(z)} \tag{7}$$

and “nest” (see figure VII.4.6b),

$$\Sigma(z) = \frac{1}{m^2}\,G(z) \tag{8}$$

Figure VII.4.6

Combining these two equations we obtain a simple quadratic equation for $G(z)$ that we
can immediately solve to obtain

$$G(z) = \frac{m^2}{2}\left(z - \sqrt{z^2 - \frac{4}{m^2}}\right) \tag{9}$$

(From the definition of $G(z)$ we see that $G(z) \to 1/z$ for large $z$ and thus we choose the
negative root.) We immediately deduce that

$$\rho(E) = \frac{2}{\pi a^2}\sqrt{a^2 - E^2} \tag{10}$$

where $a^2 = 4/m^2$. This is a famous result known as Wigner’s semicircle law.
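The "repeat" and "nest" equations can also be solved by brute-force iteration, which gives a quick check on the closed form. The sketch below compares three evaluations of $G(z)$ at a point off the real axis: iterating $G = 1/(z - G/m^2)$ starting from the bare propagator $1/z$, the root of the quadratic that falls off as $1/z$, and the integral $\int dE\,\rho(E)/(z - E)$ over the semicircle with $a^2 = 4/m^2$. The test point, iteration count, and grid size are arbitrary choices.

```python
import numpy as np


def g_closed_form(z, m=1.0):
    """Root of G^2/m^2 - z G + 1 = 0 that behaves as 1/z for large z."""
    a2 = 4.0 / m**2
    # z * sqrt(1 - a^2/z^2) selects the branch of sqrt(z^2 - a^2) that goes like z.
    return 0.5 * m**2 * (z - z * np.sqrt(1.0 - a2 / z**2))


def g_by_iteration(z, m=1.0, iterations=500):
    """Iterate 'nest' (Sigma = G/m^2) and 'repeat' (G = 1/(z - Sigma))."""
    g = 1.0 / z                      # bare quark propagator
    for _ in range(iterations):
        sigma = g / m**2
        g = 1.0 / (z - sigma)
    return g


def g_from_semicircle(z, m=1.0, points=200001):
    """Integrate rho(E)/(z - E) over the semicircle by the trapezoid rule."""
    a = 2.0 / m
    e = np.linspace(-a, a, points)
    rho = 2.0 / (np.pi * a**2) * np.sqrt(a**2 - e**2)
    f = rho / (z - e)
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(e))


if __name__ == "__main__":
    z = 0.5 + 1.0j
    print(g_closed_form(z), g_by_iteration(z), g_from_semicircle(z))
```

The iteration converges to the physical root because that root satisfies $|G|^2 < m^2$ away from the cut; close to the real axis the convergence becomes slow.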


The Dyson gas 

I hope that you are struck by the elegance of the large N planar diagram approach. But you 
might have also noticed that the gluons do not interact. It is as if we have solved quantum 
electrodynamics while we have to solve quantum chromodynamics. What if we have to
deal with $V(\varphi) = \frac{1}{2}m^2\varphi^2 + g\varphi^4$? The $g\varphi^4$ term causes the gluons to interact with each
other, generating horrible diagrams such as the one in figure VII.4.7. Clearly, diagrams
proliferate and as far as I know nobody has ever been able to calculate G(z) using the 
Feynman diagram approach. 

Happily, $G(z)$ can be evaluated using another method known as the Dyson gas approach.
The key is to write

$$\varphi = U^\dagger \Lambda U \tag{11}$$

where $\Lambda$ denotes the $N$ by $N$ diagonal matrix with diagonal elements equal to $\lambda_i$,
$i = 1, \ldots, N$. Change the integration variable in (3) from $\varphi$ to $U$ and $\Lambda$:

$$Z = \int dU \int \Big(\prod_i d\lambda_i\Big)\, J\, e^{-N\sum_i V(\lambda_i)} \tag{12}$$

with $J$ the Jacobian. Since the integrand does not depend on $U$ we can throw away the
integral over $U$. It just gives the volume of the group SU(N). Does this remind you of
chapter VII.1? Indeed, in (11) $U$ corresponds to the unphysical gauge degrees of freedom;
the relevant degrees of freedom are the eigenvalues $\{\lambda_i\}$. As an exercise you can use the
Faddeev-Popov method to calculate $J$.

Instead, we will follow the more elegant tack of determining $J$ by arguing from general
principles. The change of integration variables in (11) is ill defined when any two of the $\lambda_i$’s
are equal, at which point $J$ must vanish. (Recall that the change from Cartesian coordinates
to spherical coordinates is ill defined at the north and south poles and indeed the Jacobian
in $\sin\theta\, d\theta\, d\varphi$ vanishes at $\theta = 0$ and $\pi$.) Since the $\lambda_i$’s are created equal, interchange
symmetry dictates that $J = [\prod_{m>n}(\lambda_m - \lambda_n)]^\beta$. The power $\beta$ can be fixed by dimensional
analysis. With $N^2$ matrix elements $d\varphi$ obviously has dimension $\lambda^{N^2}$ while $(\prod_i d\lambda_i)J$ has
dimension $\lambda^N \lambda^{\beta N(N-1)/2}$; thus $\beta = 2$.

Having determined $J$, let us rewrite (12) as

$$Z = \int \Big(\prod_i d\lambda_i\Big)\Big[\prod_{m>n}(\lambda_m - \lambda_n)\Big]^2 e^{-N\sum_i V(\lambda_i)} = \int \Big(\prod_i d\lambda_i\Big)\, e^{-N\sum_i V(\lambda_i) + \frac{1}{2}\sum_{m\neq n}\log(\lambda_m - \lambda_n)^2} \tag{13}$$

Dyson pointed out that in this form $Z = \int (\prod_i d\lambda_i)\, e^{-N E(\lambda_1, \ldots, \lambda_N)}$ is just the partition
function of a classical 1-dimensional gas (recall chapter V.2). Think of $\lambda_i$, a real number,
as the position of the $i$th molecule. The energy of a configuration

$$E(\lambda_1, \ldots, \lambda_N) = \sum_k V(\lambda_k) - \frac{1}{2N}\sum_{m\neq n}\log(\lambda_m - \lambda_n)^2 \tag{14}$$

consists of two terms with obvious physical interpretations. The gas is confined in a
potential well $V(x)$ and the molecules repel⁴ each other with the two-body potential
$-(1/N)\log(x - y)^2$. Note that the two terms in $E$ are of the same order in $N$ since each
sum counts for a power of $N$. In the large $N$ limit (we can think of $N$ as the inverse
temperature), we evaluate $Z$ by steepest descent and minimize $E$, obtaining

$$V'(\lambda_k) = \frac{2}{N}\sum_{n\neq k}\frac{1}{\lambda_k - \lambda_n} \tag{15}$$

which in the continuum limit, as the poles in (15) merge into a cut, becomes
$V'(\lambda) = 2\mathcal{P}\int d\mu\,[\rho(\mu)/(\lambda - \mu)]$, where $\rho(\mu)$ is the unknown function we want to solve for and $\mathcal{P}$
denotes principal value.

4 Note that this corresponds to the repulsion between energy levels in quantum mechanics.
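For the Gaussian well $V(\lambda) = \frac{1}{2}m^2\lambda^2$ the finite-$N$ saddle-point condition, $V'(\lambda_k) = (2/N)\sum_{n\neq k} 1/(\lambda_k - \lambda_n)$, can be checked numerically. The sketch below relies on a classical fact that is not part of the text: the zeros $x_k$ of the Hermite polynomial $H_N$ satisfy $x_k = \sum_{n\neq k} 1/(x_k - x_n)$, so rescaling them by $\sqrt{2/(N m^2)}$ should solve the condition. Treat that identification as an assumption, which the code tests by computing the residual.

```python
import numpy as np


def saddle_point_positions(n, m=1.0):
    """Candidate equilibrium positions: rescaled zeros of the Hermite polynomial H_n."""
    x, _weights = np.polynomial.hermite.hermgauss(n)   # nodes are the zeros of H_n
    return x * np.sqrt(2.0 / (n * m**2))


def saddle_point_residual(lam, m=1.0):
    """V'(lam_k) - (2/N) sum_{n != k} 1/(lam_k - lam_n) for V = m^2 lam^2 / 2."""
    n = len(lam)
    diff = lam[:, None] - lam[None, :]
    np.fill_diagonal(diff, np.inf)                     # drop the n = k terms
    repulsion = (2.0 / n) * np.sum(1.0 / diff, axis=1)
    return m**2 * lam - repulsion


if __name__ == "__main__":
    lam = saddle_point_positions(50)
    print("largest |residual|:", np.max(np.abs(saddle_point_residual(lam))))
    print("outermost molecule:", lam.max(), "(compare with 2/m)")
```

As $N$ grows the outermost position creeps toward $2/m$, the edge of the semicircle found from the planar diagrams.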

Defining as before $G(z) = \int d\mu\,[\rho(\mu)/(z - \mu)]$ we see that our equation for $\rho(\mu)$ can be
written as $\mathrm{Re}\,G(\lambda + i\varepsilon) = \frac{1}{2}V'(\lambda)$. In other words, $G(z)$ is a real analytic function with cuts
along the real axis. We are given the real part of $G(z)$ on the cut and are to solve for the
imaginary part. Brezin, Itzykson, Parisi, and Zuber have given an elegant solution of this
problem. Assume for simplicity that $V(z)$ is an even polynomial and that there is only one
cut (see exercise VII.4.7). Invoke symmetry and, incorporating what we know, postulate
the form

$$G(z) = \frac{1}{2}\left[V'(z) - P(z)\sqrt{z^2 - a^2}\right]$$

with $P(z)$ an unknown even polynomial. Remarkably, the requirement $G(z) \to 1/z$ for
large $z$ completely determines $P(z)$. Pedagogically, it is clearest to go to a specific example,
say $V(z) = \frac{1}{2}m^2z^2 + gz^4$. Since $V'(z)$ is a cubic polynomial in $z$, $P(z)$ has to be a quadratic
(even) polynomial in $z$. Taking the limit $z \to \infty$ and requiring the coefficients of $z^3$ and
of $z$ in $G(z)$ to vanish and the coefficient of $1/z$ to be 1 gives us three equations for three
unknowns [namely $a$ and the two unknowns in $P(z)$]. The density of eigenvalues is then
determined to be $\rho(E) = (1/2\pi)P(E)\sqrt{a^2 - E^2}$, the factor $\frac{1}{2}$ coming from the $\frac{1}{2}$ in the postulated
form of $G(z)$; for $g = 0$ this reduces to (10).
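The three conditions can be worked through explicitly for $V(z) = \frac{1}{2}m^2z^2 + gz^4$. Writing $P(z) = c_0 + c_2 z^2$ and expanding $\sqrt{z^2 - a^2} = z - a^2/2z - a^4/8z^3 - \cdots$, the sketch below assumes the resulting conditions are $c_2 = 4g$, $c_0 = m^2 + 2ga^2$, and $3ga^4 + m^2a^2 = 4$. These come from my own expansion rather than from the text, so the code checks them by verifying that the density $\rho(E) = (1/2\pi)P(E)\sqrt{a^2 - E^2}$, with the $1/2\pi$ following from $\rho = -(1/\pi)\,\mathrm{Im}\,G$ and the $\frac{1}{2}$ in the ansatz, integrates to 1. It covers only the one-cut case with $m^2 > 0$ and $g \ge 0$.

```python
import math


def one_cut_solution(m2, g):
    """Return (a^2, c0, c2) for V = m2 z^2/2 + g z^4, with P(z) = c0 + c2 z^2.

    Assumes m2 > 0 and g >= 0 so that a single cut [-a, a] is expected.
    """
    if g == 0:
        a2 = 4.0 / m2
    else:
        # positive root of 3 g x^2 + m2 x - 4 = 0 with x = a^2
        a2 = (-m2 + math.sqrt(m2**2 + 48.0 * g)) / (6.0 * g)
    c2 = 4.0 * g
    c0 = m2 + 2.0 * g * a2
    return a2, c0, c2


def density(e, m2, g):
    """Eigenvalue density (1/2 pi) P(E) sqrt(a^2 - E^2), zero outside the cut."""
    a2, c0, c2 = one_cut_solution(m2, g)
    if e * e >= a2:
        return 0.0
    return (c0 + c2 * e * e) * math.sqrt(a2 - e * e) / (2.0 * math.pi)


def total_probability(m2, g, steps=20000):
    """Midpoint-rule integral of the density over the cut."""
    a = math.sqrt(one_cut_solution(m2, g)[0])
    h = 2.0 * a / steps
    return h * sum(density(-a + (k + 0.5) * h, m2, g) for k in range(steps))


if __name__ == "__main__":
    print(one_cut_solution(1.0, 1.0))
    print(total_probability(1.0, 1.0))
```

Turning on a positive $g$ narrows the cut, as expected for a steeper confining well.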

I think the lesson to take away here is that Feynman diagrams, in spite of their historical 
importance in quantum electrodynamics and their usefulness in helping us visualize what 
is going on, are vastly overrated. Surely, nobody imagines that QCD, even large N QCD, 
will one day be solved by summing Feynman diagrams. What is needed is the analog of 
the Dyson gas approach for large N QCD. Conversely, if a reader of this book manages to 
calculate G(z) by summing planar diagrams (after all, the answer is known!), the insight 
he or she gains might conceivably be useful in seeing how to deal with planar diagrams 
in large N QCD. 


Field theories in the large N limit 


A number of field theories have also been solved in the large $N$ expansion. I will tell you
about one example, the Gross-Neveu model, partly because it has some of the flavor of
QCD. The model is defined by

$$S(\psi) = \int d^2x \left[\sum_{a=1}^{N} \bar\psi_a\, i\slashed{\partial}\,\psi_a + \frac{g^2}{2N}\Big(\sum_{a=1}^{N} \bar\psi_a\psi_a\Big)^2\right] \tag{16}$$

Recall from chapter III.3 that this theory should be renormalizable in (1 + 1)-dimensional
spacetime. For some finite $N$, say $N = 3$, this theory certainly appears no easier to solve
than any other fully interacting field theory. But as we will see, as $N \to \infty$ we can extract
a lot of interesting physics.

Using the identity (A.14) we can rewrite the theory as

$$S(\psi, \sigma) = \int d^2x \left[\sum_{a=1}^{N} \bar\psi_a\,(i\slashed{\partial} - \sigma)\,\psi_a - \frac{N}{2g^2}\,\sigma^2\right] \tag{17}$$

By introducing the scalar field $\sigma(x)$ we have “undone” the four-fermion interaction. (Recall
that we used the same trick in chapter III.5.) You will note that the physics involved is
similar to that behind the introduction of the weak boson to generate the Fermi interaction.
Using what we learned in chapters II.5 and IV.3 we can immediately integrate out the
fermion fields to obtain an action written purely in terms of the $\sigma$ field


$$S(\sigma) = -\int d^2x\,\frac{N}{2g^2}\,\sigma^2 - iN\,\mathrm{tr}\log(i\slashed{\partial} - \sigma) \tag{18}$$

Note the factor of $N$ in front of the tr log term coming from the integration over $N$ fermion
fields. With the malice of forethought we, or rather Gross and Neveu, have introduced an
explicit factor of $1/N$ in the coupling strength in (16), so that the two terms in (18) both scale
as $N$. Thus, the path integral $Z = \int D\sigma\, e^{iS(\sigma)}$ may be evaluated by the steepest descent or
stationary phase method in the large $N$ limit. We simply extremize $S(\sigma)$.

Incidentally, we can see the judiciousness of the choice $a = 1$ in large $N$ QCD in the
same way. Integrating out the quarks in (1) we get

$$S = -\int d^4x\,\frac{N}{2g^2}\,\mathrm{tr}\,F_{\mu\nu}F^{\mu\nu} + N\,\mathrm{tr}\log(i(\slashed{\partial} - i\slashed{A}) - m)$$

and thus the two terms both scale as N and can balance each other. The increase in the 
number of degrees of freedom has to be offset by a weakening of the coupling. 

To study the ground state behavior of the theory, we restrict our attention to field
configurations $\sigma(x)$ that do not depend on $x$. (In other words, we are not expecting
translation symmetry to be spontaneously broken.) We can immediately take over the result
you got in exercise IV.3.3 and write the effective potential

$$\frac{1}{N}V(\sigma) = \frac{1}{2g(\mu)^2}\,\sigma^2 + \frac{1}{4\pi}\,\sigma^2\left(\log\frac{\sigma^2}{\mu^2} - 3\right) \tag{19}$$

We have imposed the condition $(1/N)[d^2V(\sigma)/d\sigma^2]|_{\sigma=\mu} = 1/g(\mu)^2$ as the definition of
the mass scale dependent coupling $g(\mu)$ (compare IV.3.18). The statement that $V(\sigma)$ is
independent of $\mu$ immediately gives

$$\frac{1}{g(\mu)^2} - \frac{1}{g(\mu')^2} = \frac{1}{\pi}\log\frac{\mu}{\mu'} \tag{20}$$

As $\mu \to \infty$, $g(\mu) \to 0$. Remarkably, this theory is asymptotically free, just like QCD. If we
want to, we can work backward to find the flow equation

$$\mu\frac{d}{d\mu}\,g(\mu) = -\frac{1}{2\pi}\,g(\mu)^3 + \cdots \tag{21}$$


The theory in its different incarnations, (16), (17), and (18), enjoys a discrete $Z_2$ symmetry
under which $\psi_a \to \gamma^5\psi_a$ and $\sigma \to -\sigma$. As in chapter IV.3, this symmetry is dynamically
broken by quantum fluctuations. The minimum of $V(\sigma)$ occurs at $\sigma_{min} = \mu e^{1 - \pi/g(\mu)^2}$ and
so according to (17) the fermions acquire a mass

$$m_F = \sigma_{min} = \mu\, e^{1 - \pi/g(\mu)^2} \tag{22}$$

Note that this highly nontrivial result can hardly be seen by staring at (16) and we have no
way of proving it for finite $N$. In the spirit of the large $N$ approach, however, we expect that
the fermion mass might be given by $m_F = \mu e^{1 - \pi/g(\mu)^2} + O(1/N^2)$ so that (22) would be a
decent approximation even for say, $N = 3$. Since $m_F$ is physically measurable, it better not
depend on $\mu$. You can check that.
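The check can also be done numerically. The sketch below assumes the effective potential has the form $V(\sigma)/N = \sigma^2/2g(\mu)^2 + (\sigma^2/4\pi)[\log(\sigma^2/\mu^2) - 3]$, which is consistent with the normalization condition at $\sigma = \mu$ quoted in the text, and runs the coupling with $1/g(\mu)^2 - 1/g(\mu')^2 = (1/\pi)\log(\mu/\mu')$. The starting values of $\mu$ and $g^2$ are arbitrary illustrative inputs.

```python
import math


def effective_potential(sigma, mu, g2):
    """V(sigma)/N with the coupling g2 = g(mu)^2 defined at the scale mu."""
    return sigma**2 / (2.0 * g2) + sigma**2 / (4.0 * math.pi) * (
        math.log(sigma**2 / mu**2) - 3.0
    )


def run_g2(g2_mu, mu, mu_prime):
    """Coupling at mu_prime from 1/g(mu)^2 - 1/g(mu')^2 = (1/pi) log(mu/mu')."""
    inverse = 1.0 / g2_mu - math.log(mu / mu_prime) / math.pi
    return 1.0 / inverse


def fermion_mass(mu, g2):
    """m_F = sigma_min = mu * exp(1 - pi/g(mu)^2)."""
    return mu * math.exp(1.0 - math.pi / g2)


if __name__ == "__main__":
    mu, g2 = 1.0, 1.0                       # hypothetical starting point
    for mu_prime in (1.0, 3.0, 10.0, 100.0):
        g2_prime = run_g2(g2, mu, mu_prime)
        print(mu_prime, g2_prime, fermion_mass(mu_prime, g2_prime))
```

The printed coupling shrinks as the scale grows while the mass column stays fixed, which is dimensional transmutation in miniature.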

This theory also exhibits dimensional transmutation as described in the previous chapter.
We start out with a theory with a dimensionless coupling $g$ and end up with a dimensional
fermion mass $m_F$. Indeed, any other quantity with dimension of mass would have
to be equal to $m_F$ times a pure number.


Dynamically generated kinks 

I discuss the existence of kinks and solitons in chapter V.6. You clearly understood that the
existence of such objects follows from general considerations of symmetry and topology,
rather than from detailed dynamics. Here we have a (1 + 1)-dimensional theory with a
discrete $Z_2$ symmetry, so we certainly expect a kink, namely a time independent configuration
$\sigma(x)$ (henceforth $x$ will denote only the spatial coordinate and will no longer label
a generic point in spacetime) such that $\sigma(-\infty) = -\sigma_{min}$ and $\sigma(+\infty) = \sigma_{min}$. [Obviously,
there is also the antikink with $\sigma(-\infty) = \sigma_{min}$ and $\sigma(+\infty) = -\sigma_{min}$.]

At first sight, it would seem almost impossible to determine the precise shape of the
kink. In principle, we have to evaluate $\mathrm{tr}\log[i\slashed{\partial} - \sigma(x)]$ for an arbitrary function $\sigma(x)$ such
that $\sigma(+\infty) = -\sigma(-\infty)$ (and as I explained in chapter IV.3, this involves finding all the
eigenvalues of the operator $i\slashed{\partial} - \sigma(x)$, summing over the logarithm of the eigenvalues),
and then varying this functional of $\sigma(x)$ to find the optimal shape of the kink.

Remarkably, the shape can actually be determined thanks to a clever observation.⁵ In
analogy with the steps leading to (IV.3.24) we note that

$$\mathrm{tr}\log[i\slashed{\partial} - \sigma(x)] = \mathrm{tr}\log\gamma^5[i\slashed{\partial} - \sigma(x)]\gamma^5 = \mathrm{tr}\log(-1)[i\slashed{\partial} + \sigma(x)]$$

and thus up to an irrelevant additive constant

$$\mathrm{tr}\log(i\slashed{\partial} - \sigma(x)) = \tfrac{1}{2}\,\mathrm{tr}\log[i\slashed{\partial} - \sigma(x)][i\slashed{\partial} + \sigma(x)] = \tfrac{1}{2}\,\mathrm{tr}\log\{-\partial^2 + i\gamma^1\sigma'(x) - [\sigma(x)]^2\} \tag{23}$$

Since $\gamma^1$ has eigenvalues $\pm i$, this is equal to

$$\tfrac{1}{2}\big\{\mathrm{tr}\log\{-\partial^2 + \sigma'(x) - [\sigma(x)]^2\} + \mathrm{tr}\log\{-\partial^2 - \sigma'(x) - [\sigma(x)]^2\}\big\}$$

but these two terms are equal by parity (space reflection) and hence

$$\mathrm{tr}\log[i\slashed{\partial} - \sigma(x)] = \mathrm{tr}\log\{-\partial^2 - \sigma'(x) - [\sigma(x)]^2\}$$

5 C. Callan, S. Coleman, D. Gross, and A. Zee, (unpublished). See D. J. Gross, “Applications of the
Renormalization Group to High-Energy Physics,” in: R. Balian and J. Zinn-Justin, eds., Methods in Field Theory, p. 247. By
the way, I recommend this book to students of field theory.

Referring to (18) we see that $S(\sigma)$ is the sum of two terms, a term quadratic in $\sigma(x)$
and a term that depends only on the combination $\sigma'(x) + [\sigma(x)]^2$. But we know that $\sigma_{min}$
minimizes $S(\sigma)$. Thus, the soliton is given by the solution of the ordinary differential
equation

$$\sigma'(x) + [\sigma(x)]^2 = \sigma_{min}^2 \tag{24}$$

namely $\sigma(x) = \sigma_{min}\tanh(\sigma_{min}x)$. The soliton would be observed as an object of size
$1/\sigma_{min} = 1/m_F$. I leave it to you to show that its mass is given by

$$m_S = \frac{N}{\pi}\,m_F \tag{25}$$

Precisely as theorized in the last chapter, the ratio $m_S/m_F$ comes out to be a pure number,
$N/\pi$, as it must.
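That the hyperbolic tangent profile solves the kink equation $\sigma'(x) + [\sigma(x)]^2 = \sigma_{min}^2$ takes one line with $\tanh' = 1 - \tanh^2$. A small numerical sketch confirms it with a finite-difference derivative; the value of $\sigma_{min}$ and the step size are arbitrary.

```python
import math


def kink(x, sigma_min):
    """Kink profile sigma(x) = sigma_min * tanh(sigma_min * x)."""
    return sigma_min * math.tanh(sigma_min * x)


def kink_equation_residual(x, sigma_min, h=1e-5):
    """sigma'(x) + sigma(x)^2 - sigma_min^2, with a central-difference derivative."""
    derivative = (kink(x + h, sigma_min) - kink(x - h, sigma_min)) / (2.0 * h)
    return derivative + kink(x, sigma_min) ** 2 - sigma_min**2


if __name__ == "__main__":
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, kink(x, 1.3), kink_equation_residual(x, 1.3))
```

The profile interpolates between $-\sigma_{min}$ and $+\sigma_{min}$ over a distance of order $1/\sigma_{min}$, in line with the size quoted above.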

By an even more clever method that I do not have space to describe, Dashen, Hasslacher,
and Neveu were able to study time dependent configurations of $\sigma$ and determine the mass
spectrum of this model.


Exercises 


VII.4.1 Since the number of gluons only differs by one, it is generally argued that it does not make any difference
whether we choose to study the U(N) theory or the SU(N) theory. Discuss how the gluon propagator in
a U(N) theory differs from the gluon propagator in an SU(N) theory and decide which one is easier.

VII.4.2 As a challenge, solve large N QCD in (1 + 1)-dimensional spacetime. [Hint: The key is that in
(1 + 1)-dimensional spacetime with a suitable gauge choice we can integrate out the gauge potential $A_\mu$.] For
help, see ’t Hooft, Under the Spell of the Gauge Principle, p. 443.

VII.4.3 Show that if we had chosen to calculate $G(z) \equiv \langle (1/N)\,\mathrm{tr}[1/(z - \varphi)]\rangle$, we would have to connect the two
open ends of the quark propagator. We see that figures VII.4.5b and d lead to the same diagram. Complete
the calculation of $G(z)$ in this way.

VII.4.4 Suppose the random matrix $\varphi$ is real symmetric rather than hermitean. Show that the Feynman rules
are more complicated. Calculate the density of eigenvalues. [Hint: The double-line propagator can twist.]

VII.4.5 For hermitean random matrices $\varphi$, calculate

$$G_c(z, w) \equiv \left\langle \frac{1}{N}\mathrm{tr}\frac{1}{z - \varphi}\;\frac{1}{N}\mathrm{tr}\frac{1}{w - \varphi} \right\rangle - \left\langle \frac{1}{N}\mathrm{tr}\frac{1}{z - \varphi} \right\rangle \left\langle \frac{1}{N}\mathrm{tr}\frac{1}{w - \varphi} \right\rangle$$

for $V(\varphi) = \frac{1}{2}m^2\varphi^2$ using Feynman diagrams. [Note that this is a much simpler object to study than the
object we need to study in order to learn about localization (see exercise VI.6.1).] Show that by taking
suitable imaginary parts we can extract the correlation of the density of eigenvalues with itself. For help,
see E. Brezin and A. Zee, Phys. Rev. E51: p. 5442, 1995.

VII.4.6 Use the Faddeev-Popov method to calculate $J$ in the Dyson gas approach.

VII.4.7 For $V(\varphi) = \frac{1}{2}m^2\varphi^2 + g\varphi^4$, determine $\rho(E)$. For $m^2$ sufficiently negative (the double well potential again)
we expect the density of eigenvalues to split into two pieces. This is evident from the Dyson gas picture.
Find the critical value $m_c^2$. For $m^2 < m_c^2$ the assumption of $G(z)$ having only one cut used in the text fails.
Show how to calculate $\rho(E)$ in this regime.

VII.4.8 Calculate the mass of the soliton (25).



VII.5 Grand Unification


Crying out for unification 

A gauge theory is specified by a group and the representations the matter fields belong
to. Let us go back to chapter VII.2 and make a catalogue for the $SU(3) \otimes SU(2) \otimes U(1)$
theory. For example, the left handed up and down quarks are in a doublet $\binom{u}{d}_L$ with
hypercharge $\frac{1}{2}Y = \frac{1}{6}$. Let us denote this by $(3, 2, \frac{1}{6})_L$, with the three numbers indicating
how these fields transform under $SU(3) \otimes SU(2) \otimes U(1)$. Similarly, the right handed up
quark is $(3, 1, \frac{2}{3})_R$. The leptons are $(1, 2, -\frac{1}{2})_L$ and $(1, 1, -1)_R$, where the “1” in the first
entry indicates that these fields do not participate in the strong interaction. Writing it all
down, we see that the quarks and leptons of each family are placed in

$$(3, 2, \tfrac{1}{6})_L,\; (3, 1, \tfrac{2}{3})_R,\; (3, 1, -\tfrac{1}{3})_R,\; (1, 2, -\tfrac{1}{2})_L,\; \text{and}\; (1, 1, -1)_R \qquad (1)$$

This motley collection of representations practically cries out for further unification. 
Who would have constructed the universe by throwing this strange looking list down? 

What we would like to have is a larger gauge group $G$ containing $SU(3) \otimes SU(2) \otimes U(1)$,
such that this laundry list of representations is unified into (ideally) one great big
representation. The gauge bosons in $G$ [but not in $SU(3) \otimes SU(2) \otimes U(1)$ of course] would
couple the representations in (1) to each other.

Before we start searching for G, note that since gauge transformations commute with 
the Lorentz group, these desired gauge transformations cannot change left handed fields 
to right handed fields. So let us change all the fields in (1) to left handed fields. Recall from 
exercise II.1.9 that charge conjugation changes left handed fields to right handed fields 
and vice versa. Thus, instead of (1) we can write 

$$(3, 2, \tfrac{1}{6}),\; (3^*, 1, -\tfrac{2}{3}),\; (3^*, 1, \tfrac{1}{3}),\; (1, 2, -\tfrac{1}{2}),\; \text{and}\; (1, 1, 1) \qquad (2)$$

We now omit the subscripts L and R: everybody is left handed. 

A perfect fit

The smallest group that contains $SU(3) \otimes SU(2) \otimes U(1)$ is $SU(5)$. (If you are shaky
about group theory, study appendix B now.) Recall that $SU(5)$ has $5^2 - 1 = 24$ generators.
Explicitly, the generators are represented by 5 by 5 hermitean traceless matrices acting on
five objects we denote by $\psi^\mu$ with $\mu = 1, 2, \ldots, 5$. [These five objects form the fundamental
or defining representation of $SU(5)$.]

It is now obvious how we can fit $SU(3)$ and $SU(2)$ into $SU(5)$. Of the 24 matrices that
generate $SU(5)$, eight have the form $\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}$ and three the form $\begin{pmatrix} 0 & 0 \\ 0 & B \end{pmatrix}$, where $A$ represents
3 by 3 hermitean traceless matrices (of which there are $3^2 - 1 = 8$, the so-called Gell-Mann
matrices) and $B$ represents 2 by 2 hermitean traceless matrices (of which there
are $2^2 - 1 = 3$, namely the Pauli matrices). Clearly, the former generate an $SU(3)$ and the
latter an $SU(2)$. Furthermore, the 5 by 5 hermitean traceless matrix

$$\tfrac{1}{2}Y = \begin{pmatrix} -\frac{1}{3} & 0 & 0 & 0 & 0 \\ 0 & -\frac{1}{3} & 0 & 0 & 0 \\ 0 & 0 & -\frac{1}{3} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2} \end{pmatrix} \qquad (3)$$

generates a $U(1)$. Without being coy about it, we have already called this matrix the
hypercharge $\frac{1}{2}Y$.

In other words, if we separate the index $\mu = \{\alpha, i\}$ with $\alpha = 1, 2, 3$ and $i = 4, 5$, then
the $SU(3)$ acts on the index $\alpha$ and the $SU(2)$ acts on the index $i$. Thus, the three objects
$\psi^\alpha$ transform as a 3-dimensional representation under $SU(3)$ and hence could be a 3
or a 3*. Let us choose $\psi^\alpha$ as transforming as 3; we will see shortly that this is the right
choice with $Y/2$ given as in (3). The three objects $\psi^\alpha$ do not transform under $SU(2)$
and hence each of them belongs to the singlet 1 representation. Furthermore, they carry
hypercharge $-\frac{1}{3}$ as we can read off from (3). To sum up, $\psi^\alpha$ transform as $(3, 1, -\frac{1}{3})$
under $SU(3) \otimes SU(2) \otimes U(1)$. On the other hand, the two objects $\psi^i$ transform as 1 under
$SU(3)$ and 2 under $SU(2)$, and carry hypercharge $\frac{1}{2}$; thus they transform as $(1, 2, \frac{1}{2})$. In
other words, we embed $SU(3) \otimes SU(2) \otimes U(1)$ into $SU(5)$ by specifying how the defining
representation of $SU(5)$ decomposes into representations of $SU(3) \otimes SU(2) \otimes U(1)$

$$5 \to (3, 1, -\tfrac{1}{3}) \oplus (1, 2, \tfrac{1}{2}) \qquad (4)$$

Taking the conjugate we see that

$$5^* \to (3^*, 1, \tfrac{1}{3}) \oplus (1, 2, -\tfrac{1}{2}) \qquad (5)$$

Inspecting (2), we see that $(3^*, 1, \frac{1}{3})$ and $(1, 2, -\frac{1}{2})$ appear on the list. We are on the right
track! The fields in these two representations fit snugly into 5*.

This accounts for five of the fields contained in (2); we still have the ten fields

$$(3, 2, \tfrac{1}{6}),\; (3^*, 1, -\tfrac{2}{3}),\; \text{and}\; (1, 1, 1) \qquad (6)$$

Consider the next representation of $SU(5)$ in order of size, namely the antisymmetric
tensor representation $\psi^{\mu\nu}$. Its dimension is $(5 \times 4)/2 = 10$, precisely the number we want,
if only the quantum numbers under $SU(3) \otimes SU(2) \otimes U(1)$ work out!

Since we know that $5 \to (3, 1, -\frac{1}{3}) \oplus (1, 2, \frac{1}{2})$, we simply (again, see appendix B!) have to
work out the antisymmetric product of $(3, 1, -\frac{1}{3}) \oplus (1, 2, \frac{1}{2})$ with itself, namely the direct
sum of (where $\otimes_A$ denotes the antisymmetric product)

$$(3, 1, -\tfrac{1}{3}) \otimes_A (3, 1, -\tfrac{1}{3}) = (3^*, 1, -\tfrac{2}{3}) \qquad (7)$$

$$(3, 1, -\tfrac{1}{3}) \otimes_A (1, 2, \tfrac{1}{2}) = (3, 2, -\tfrac{1}{3} + \tfrac{1}{2}) = (3, 2, \tfrac{1}{6}) \qquad (8)$$

and

$$(1, 2, \tfrac{1}{2}) \otimes_A (1, 2, \tfrac{1}{2}) = (1, 1, 1) \qquad (9)$$

[I will walk you through (7): In $SU(3)$ $3 \otimes_A 3 = 3^*$ (remember $\varepsilon^{ijk}$ from appendix B?), in
$SU(2)$ $1 \otimes_A 1 = 1$, and in $U(1)$ the hypercharges simply add $-\frac{1}{3} - \frac{1}{3} = -\frac{2}{3}$.]

Lo and behold, these $SU(3) \otimes SU(2) \otimes U(1)$ representations form exactly the collection
of representations in (6). In other words,

$$10 \to (3, 2, \tfrac{1}{6}) \oplus (3^*, 1, -\tfrac{2}{3}) \oplus (1, 1, 1) \qquad (10)$$


The known quark and lepton fields in a given family fit perfectly into the 5* and 10 
representations of SU (5)! 
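
The bookkeeping behind (7) through (10) is easy to mechanize. The following is a minimal sketch, assuming only the decomposition (4) of the defining 5; the tags "color" and "weak" and the function name are illustrative labels of mine, not notation from the text. It forms all pairs $\mu < \nu$ and adds their hypercharges.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# The five components of the defining 5, tagged by index type and Y/2, as in (4).
# The tags "color" and "weak" are illustrative labels, not notation from the text.
FIVE = [("color", Fraction(-1, 3))] * 3 + [("weak", Fraction(1, 2))] * 2


def antisymmetric_square(components):
    """Count the independent psi^{mu nu} (mu < nu) by index types and total Y/2."""
    counts = Counter()
    for (kind_a, y_a), (kind_b, y_b) in combinations(components, 2):
        counts[(kind_a, kind_b, y_a + y_b)] += 1
    return counts


TEN = antisymmetric_square(FIVE)
for (kind_a, kind_b, half_y), multiplicity in sorted(TEN.items()):
    print(kind_a, kind_b, half_y, multiplicity)
```

The three multiplicities that come out (3, 6, and 1) are the dimensions of $(3^*, 1)$, $(3, 2)$, and $(1, 1)$, with the hypercharges of (6). The sketch only counts states and adds $U(1)$ charges; that two antisymmetrized color indices form a 3* rather than a 3 is the group theory of appendix B and is not checked here.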

I have just described the SU (5) grand unified theory of Georgi and Glashow. In spite of 
the fact that the theory has not been directly verified by experiment, it is extremely difficult 
for me and for many other physicists not to believe that SU (5) is at least structurally correct, 
in view of the perfect group theoretic fit. 

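It is often convenient to display the contents of the representations 5* and 10, using the
names given to the various fields historically. We write 5* as a column vector

$$\psi_\mu = \begin{pmatrix} \psi_\alpha \\ \psi_i \end{pmatrix} = \begin{pmatrix} \bar d_\alpha \\ \nu \\ e \end{pmatrix} \qquad (11)$$

and the 10 as an antisymmetric matrix

$$\psi^{\mu\nu} = \{\psi^{\alpha\beta}, \psi^{\alpha i}, \psi^{ij}\} = \begin{pmatrix} 0 & \bar u & -\bar u & d & u \\ -\bar u & 0 & \bar u & d & u \\ \bar u & -\bar u & 0 & d & u \\ -d & -d & -d & 0 & \bar e \\ -u & -u & -u & -\bar e & 0 \end{pmatrix} \qquad (12)$$

(I suppressed the color indices.)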

Deepening our understanding of physics

Aside from its esthetic appeal, grand unification deepens our understanding of physics 
enormously. 

1. Ever wondered why electric charge is quantized? Why don’t we see particles with
charge equal to $\sqrt{\pi}$ times the electron’s charge? In quantum electrodynamics, you could
perfectly well write down

$$\mathcal{L} = \bar\psi[i(\not\partial - i\not A) - m]\psi + \bar\psi'[i(\not\partial - i\sqrt{\pi}\not A) - m']\psi' + \cdots \qquad (13)$$

In contrast, in grand unified theory $A_\mu$ couples to a generator of the grand unifying
gauge group, and you know that the generators of any group such as $SU(N)$ (that is
not given by the direct product of $U(1)$ with other groups) are forced by the nontrivial
commutation relations $[T^a, T^b] = if^{abc}T^c$ to assume quantized values. For example, the
eigenvalues of $T^3$ in $SU(2)$, which depend on the representation of course, must be
multiples of $\frac{1}{2}$. Within $SU(3) \otimes SU(2) \otimes U(1)$, we cannot understand charge quantization:
The generator of $U(1)$ is not quantized. But upon grand unification into $SU(5)$ [or more
generally any group without $U(1)$ factors] electric charge is quantized.

The result here is deeply connected to Dirac’s remark (chapter IV.4) that electric charge is 
quantized if the magnetic monopole exists. We know from chapter V.7 that spontaneously 
broken nonabelian gauge theories such as the SU (5) theory contain the monopole. 

2. Ever wondered why the proton charge is exactly equal and opposite to the electron 
charge? This important fact allows us to construct the universe as we know it. Atoms must 
be electrically neutral to some fantastic degree of accuracy for standard cosmology to work; 
otherwise, electrostatic forces between macroscopic matter would tear the universe apart. 

This remarkable fact is nicely incorporated into $SU(5)$. It is fun to see how it goes.
Evaluating $\mathrm{tr}\, Q = 0$ over the 5* implies that $3Q_{\bar d} = -Q_{e^-}$. I have used the fact that the
strong interaction commutes with electromagnetism and hence quarks with different color
have the same charge. Now let us calculate the proton charge $Q_p$:

$$Q_p = 2Q_u + Q_d = 2(Q_d + 1) + Q_d = 3Q_d + 2 = Q_{e^-} + 2 \qquad (14)$$

If $Q_{e^-} = -1$, then $Q_p = -Q_{e^-}$, as is indeed the case!
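
The arithmetic in (14) can be checked directly from the diagonal generators. Below is a minimal sketch, assuming the usual electroweak relation $Q = T^3 + Y/2$ and a basis in which the fourth component of the 5 is neutral (the ordering is my inference from the field assignments in (11) and (12)); the variable names are mine.

```python
from fractions import Fraction

# Diagonal generators on the defining 5 (entries 1-3 are color, 4-5 are weak).
HALF_Y = [Fraction(-1, 3)] * 3 + [Fraction(1, 2)] * 2            # eq. (3)
# Assumed ordering of the weak doublet: entry 4 has T3 = -1/2, entry 5 has +1/2.
T3 = [Fraction(0)] * 3 + [Fraction(-1, 2), Fraction(1, 2)]

# Electric charge on the 5, assuming Q = T3 + Y/2.
Q_FIVE = [t + y for t, y in zip(T3, HALF_Y)]

# Charges on the conjugate 5* are the negatives; psi_5 is the electron in (11).
Q_ELECTRON = -Q_FIVE[4]

# Charges in the 10 are sums over the two indices of psi^{mu nu}.
Q_DOWN = Q_FIVE[0] + Q_FIVE[3]      # psi^{alpha 4}
Q_UP = Q_FIVE[0] + Q_FIVE[4]        # psi^{alpha 5}

Q_PROTON = 2 * Q_UP + Q_DOWN
print(Q_FIVE, Q_ELECTRON, Q_UP, Q_DOWN, Q_PROTON)
```

Because $Q$ is a traceless generator, the quark charges are tied to the electron charge and cannot be dialed independently, which is the content of (14).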

3. Recall that in electroweak theory we defined $\tan\theta = g_1/g_2$, with the coupling of the
gauge bosons $g_2 A_\mu^a T^a + g_1 B_\mu (Y/2)$. Since the normalization of $A_\mu^a$ and $B_\mu$ is fixed by
their respective kinetic energy term, the relative strength of $g_2$ and $g_1$ is determined by
the normalization of $Y/2$ relative to $T^3$. Let us evaluate $\mathrm{tr}\, T_3^2$ and $\mathrm{tr}(Y/2)^2$ on the defining
representation 5: $\mathrm{tr}\, T_3^2 = (\frac{1}{2})^2 + (\frac{1}{2})^2 = \frac{1}{2}$ and $\mathrm{tr}(Y/2)^2 = (\frac{1}{3})^2\, 3 + (\frac{1}{2})^2\, 2 = \frac{5}{6}$.

Thus, $T_3$ and $\sqrt{3/5}\,(Y/2)$ are normalized equally. So the correct grand unified combination
is $A_\mu^a T^a + B_\mu \sqrt{3/5}\,(Y/2)$, and therefore $\tan\theta = g_1/g_2 = \sqrt{3/5}$ or

$$\sin^2\theta = \frac{3}{8} \qquad (15)$$

at the grand unification scale. To compare with the experimental value of $\sin^2\theta$ we would
have to study how the couplings $g_2$ and $g_1$ flow under the renormalization group down to
low energies. We will postpone this discussion until the next chapter.
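
The normalization argument leading to (15) amounts to two traces over the defining 5. A minimal sketch in exact rational arithmetic (the variable names are mine):

```python
from fractions import Fraction

# Diagonal entries of T3 and Y/2 on the defining 5 (three color, two weak entries).
T3 = [Fraction(0)] * 3 + [Fraction(1, 2), Fraction(-1, 2)]
HALF_Y = [Fraction(-1, 3)] * 3 + [Fraction(1, 2)] * 2

TR_T3_SQ = sum(t * t for t in T3)            # tr T3^2
TR_HALF_Y_SQ = sum(y * y for y in HALF_Y)    # tr (Y/2)^2

# Equal normalization requires rescaling Y/2 by sqrt(tr T3^2 / tr (Y/2)^2),
# so tan^2(theta) = g1^2 / g2^2 is this ratio of traces.
TAN_SQ = TR_T3_SQ / TR_HALF_Y_SQ
SIN_SQ = TAN_SQ / (1 + TAN_SQ)
print(TR_T3_SQ, TR_HALF_Y_SQ, TAN_SQ, SIN_SQ)
```

Only the squares of the $T_3$ entries matter here, so the result does not depend on which weak component is assigned $+\frac{1}{2}$.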

Freedom from anomaly 

Recall from chapter VII.2 that the key to proving renormalizability of nonabelian gauge
theory is the ability to pass freely between the unitary gauge and the $R_\xi$ gauge. The
crucial ingredient is gauge invariance and the resulting Ward-Takahashi identities (see
chapter II.7).

Suddenly you start to worry. What about the chiral anomaly? The existence of the 
anomaly means that some Ward-Takahashi identities fail to hold. For our theories to make 
sense, they had better be free from anomalies. I remarked in chapter IV.7 that the historical 
name “anomaly” makes it sound like some kind of sickness. Well, in a way, it is. 

We should have already checked the $SU(3) \otimes SU(2) \otimes U(1)$ theory for anomalies, but
we didn’t. I will let you do it as an exercise. Here I will show that the $SU(5)$ theory is healthy.
If the $SU(5)$ theory is anomaly-free, then a fortiori so is the $SU(3) \otimes SU(2) \otimes U(1)$ theory.

In chapter IV.7 I computed the anomaly in an abelian theory but, as I remarked there,
clearly all we have to do to generalize to a nonabelian theory is to insert a generator $T^a$
of the gauge group at each vertex of the triangle diagram in figure IV.7.1. Summing over
the various fermions running around the loop, we see that the anomaly is proportional
to $A^{abc}(R) \equiv \mathrm{tr}(T^a\{T^b, T^c\})$, where $R$ denotes the representation to which the fermions
belong. We have to sum $A^{abc}(R)$ over all the representations in the theory, remembering
to associate opposite signs to left handed and right handed fermion fields. (It may be
helpful to remind yourself of remark 3 in chapter IV.7 and exercise IV.7.6.)

We are now ready to give the $SU(5)$ theory a health check. First, all fermion fields in (2)
are left handed. Second, convince yourself (simply imagine calculating $A^{abc}$ for all possible
$abc$) that it suffices to set $T^a$, $T^b$, and $T^c$ all equal to

$$T = \begin{pmatrix} 2 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & 0 & -3 \end{pmatrix}$$

a multiple of the hypercharge. Let us now evaluate $\mathrm{tr}\, T^3$ on the 5* representation,

$$\mathrm{tr}\, T^3|_{5^*} = 3(-2)^3 + 2(+3)^3 = 30 \qquad (16)$$

and on the 10,

$$\mathrm{tr}\, T^3|_{10} = 3(+4)^3 + 6(-1)^3 + (-6)^3 = -30 \qquad (17)$$

An apparent miracle! The anomaly cancels.
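
The two sums of cubes in (16) and (17) can be reproduced from the diagonal of $T$ alone. A minimal sketch (the names are mine): the eigenvalues on the 5* are the negatives of those on the 5, and the eigenvalues on the antisymmetric 10 are the pairwise sums over $\mu < \nu$.

```python
from itertools import combinations

# Diagonal of the generator T on the defining 5, a multiple of the hypercharge.
T_FIVE = [2, 2, 2, -3, -3]

# Conjugate representation: every eigenvalue changes sign.
T_FIVE_BAR = [-t for t in T_FIVE]

# Antisymmetric tensor psi^{mu nu}: eigenvalue is t_mu + t_nu for each pair mu < nu.
T_TEN = [a + b for a, b in combinations(T_FIVE, 2)]


def trace_cubed(eigenvalues):
    """Return tr T^3 for a diagonal generator given by its eigenvalues."""
    return sum(t ** 3 for t in eigenvalues)


ANOMALY_FIVE_BAR = trace_cubed(T_FIVE_BAR)
ANOMALY_TEN = trace_cubed(T_TEN)
print(ANOMALY_FIVE_BAR, ANOMALY_TEN, ANOMALY_FIVE_BAR + ANOMALY_TEN)
```

The check covers only this one generator; the claim that it suffices to take all three generators equal to $T$ is the step you are asked to convince yourself of in the text.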

This remarkable cancellation between sums of cubes of a strange list of numbers
suggests strongly, to say the least, that $SU(5)$ is not the end of the story. Besides, it would
be nice if the 5* and 10 could be unified into a single representation.


Exercises 


VII.5.1 Write down the charge operator $Q$ acting on 5, the defining representation $\psi^\mu$. Work out the charge
content of the 10 = $\psi^{\mu\nu}$ and identify the various fields contained therein.

VII.5.2 Show that for any grand unified theory, as long as it is based on a simple group, we have at the unification
scale

$$\sin^2\theta = \frac{\sum T_3^2}{\sum Q^2} \qquad (18)$$

where the sum is taken over all fermions.

VII.5.3 Check that the $SU(3) \otimes SU(2) \otimes U(1)$ theory is anomaly-free. [Hint: The calculation is more involved
than in $SU(5)$ since there are more independent generators. First show that you only have to evaluate
$\mathrm{tr}\, Y\{T^a, T^b\}$ and $\mathrm{tr}\, Y^3$, with $T^a$ and $Y$ the generators of $SU(2)$ and $U(1)$, respectively.]

VII.5.4 Construct grand unified theories based on $SU(6)$, $SU(7)$, $SU(8)$, ..., until you get tired of the game.
People used to get tenure doing this. [Hint: You would have to invent fermions yet to be experimentally
discovered.]



VII.6 Protons Are Not Forever


Proton decay 

Charge conservation guarantees the stability of the electron, but what about the stability of
the proton? Charge conservation allows $p \to \pi^0 + e^+$. No fundamental principle says that
the proton lives forever, and yet the proton is known for its longevity: It has been around
essentially since the universe began.

The stability of the proton had to be decreed by an authority figure: Eugene Wigner was 
the first to proclaim the law of baryon number conservation. The story goes that when 
Wigner was asked how he knew that the proton lives forever he quipped, “I can feel it in 
my bones.” I take the remark to mean that just from the fact that we do not glow in the 
dark we can set a fairly good lower bound on the proton’s life span. 

As soon as we start grand unifying, we better start worrying. Generically, when we grand 
unify we put quarks and leptons into the same representation of some gauge group [see 
(VII.5.11 and VII.5.12)]. This miscegenation immediately implies that there are gauge 
bosons transforming quarks into leptons and vice versa. The bag of three quarks known 
as the proton could very well get turned into leptons upon the exchange of these gauge 
bosons. In other words, the proton, the rock on which our world is built, may not be
forever! Thus, grand unification runs the risk of being immediately falsified.

Let $M_X$ denote generically the masses of those gauge bosons transforming quarks into
leptons and vice versa. Then the amplitude for proton decay is of order $g^2/M_X^2$, with $g$
the coupling strength of the grand unifying gauge group, and the proton decay rate $\Gamma$ is
given by $(g^2/M_X^2)^2$ times a phase space factor controlled essentially by the proton mass
$m_p$ since the pion and positron masses are negligible compared to the proton mass. By
dimensional analysis, we determine that $\Gamma \sim (g^2/M_X^2)^2 m_p^5$. Since the proton is known to
live for something like at least $10^{31}$ years, $M_X$ had better be huge compared to the kind of
energy scales we can reach experimentally.
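
To get a feeling for the numbers, here is a rough sketch of the dimensional estimate $\Gamma \sim (g^2/M_X^2)^2 m_p^5$ converted to years. The function name and the sample inputs ($M_X = 10^{15}$ GeV and $g^2/4\pi = 1/40$) are illustrative assumptions of mine, not values taken from the text, and all factors of order one (hadronic matrix elements, phase space) are dropped.

```python
import math

HBAR_GEV_SECONDS = 6.582e-25     # hbar in GeV * s, to convert a rate in GeV to 1/s
SECONDS_PER_YEAR = 3.156e7
PROTON_MASS_GEV = 0.938


def proton_lifetime_years(m_x_gev, alpha_gut, m_p_gev=PROTON_MASS_GEV):
    """Order-of-magnitude lifetime from Gamma ~ (g^2 / M_X^2)^2 * m_p^5."""
    g_squared = 4.0 * math.pi * alpha_gut
    rate_gev = (g_squared / m_x_gev ** 2) ** 2 * m_p_gev ** 5
    return HBAR_GEV_SECONDS / rate_gev / SECONDS_PER_YEAR


# Illustrative inputs only (assumed, not from the text).
for m_x in (1e14, 1e15, 1e16):
    print(f"M_X = {m_x:.0e} GeV -> lifetime ~ {proton_lifetime_years(m_x, 1 / 40):.1e} years")
```

The point of the sketch is the steep $M_X^4$ dependence: each factor of 10 in $M_X$ changes the lifetime by a factor of $10^4$, which is why the estimate should not be trusted to better than a few orders of magnitude.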

The mass $M_X$ is of the same order as the mass scale $M_{GUT}$ at which the grand unified
theory is spontaneously broken down to $SU(3) \otimes SU(2) \otimes U(1)$. Specifically, in the $SU(5)$
theory, a Higgs field transforming as the adjoint 24, with its vacuum expectation value
$\langle H^\mu_\nu \rangle$ equal to the diagonal matrix with elements $(-\frac{1}{3}, -\frac{1}{3}, -\frac{1}{3}, \frac{1}{2}, \frac{1}{2})$ times some $v$, can
do the job, as was discussed in chapter IV.6. The gauge bosons in $SU(3) \otimes SU(2) \otimes U(1)$
remain massless while the other gauge bosons acquire mass $M_X$ of order $gv$.

Figure VII.6.1

To determine $M_{GUT}$, we apply renormalization group flow to $g_3$, $g_2$, and $g_1$, the couplings
of $SU(3)$, $SU(2)$, and $U(1)$, respectively. The idea is that as we move up in the
mass or energy scale $\mu$ the two asymptotically free couplings $g_3(\mu)$ and $g_2(\mu)$ decrease
while $g_1(\mu)$ increases. Thus, at some mass scale $M_{GUT}$ they will meet and that is where
$SU(3) \otimes SU(2) \otimes U(1)$ is unified into $SU(5)$ (see figure VII.6.1). Because of the extremely
slow logarithmic running (it should be called walking or even crawling but again for historical
reasons we are stuck with running) of the coupling constant, we anticipate that the
unification mass scale $M_{GUT}$ will come out to be much larger than any scale we were used
to in particle physics prior to grand unification. In fact, $M_{GUT}$ will turn out to have an
enormous value of the order $10^{14\text{--}15}$ GeV and the idea of grand unification passes its first
hurdle.


Stability of the world implies the weakness of electromagnetism 


Using the result of exercise VI.8.1 we obtain (here $\alpha_s \equiv g_3^2/4\pi$ and $\alpha_{GUT} \equiv g^2/4\pi$ denote
the strong interaction and grand unification analog of the fine structure constant $\alpha$,
respectively, with $F$ the number of families)

$$\frac{4\pi}{[g_3(\mu)]^2} \equiv \frac{1}{\alpha_s(\mu)} = \frac{1}{\alpha_{GUT}} + \frac{1}{6\pi}(4F - 33)\log\frac{M_{GUT}}{\mu} \qquad (1)$$

$$\frac{4\pi}{[g_2(\mu)]^2} \equiv \frac{\sin^2\theta(\mu)}{\alpha(\mu)} = \frac{1}{\alpha_{GUT}} + \frac{1}{6\pi}(4F - 22)\log\frac{M_{GUT}}{\mu} \qquad (2)$$

$$\frac{3}{5}\frac{4\pi}{[g_1(\mu)]^2} \equiv \frac{3}{5}\frac{\cos^2\theta(\mu)}{\alpha(\mu)} = \frac{1}{\alpha_{GUT}} + \frac{1}{6\pi}\,4F\log\frac{M_{GUT}}{\mu} \qquad (3)$$

By $\theta(\mu)$ we mean the value of $\theta$ at the scale $\mu$. At $\mu = M_{GUT}$, the three couplings are related
through $SU(5)$.

We evaluate these equations for some experimentally accessible value of $\mu$, plugging in
measured values of $\alpha_s$ and $\alpha$. With three equations, we not only manage to determine the
unification scale $M_{GUT}$ and coupling $\alpha_{GUT}$, but we can predict $\theta$. In other words, unless
the ratio of $g_1$ to $g_2$ is precisely right, the three lines in figure VII.6.1 will not meet at one
point.

Note that the number of fermion families $F$ contributes equally to (1), (2), and (3). This is
as it should be since the fermions are effectively massless for the purpose of this calculation
and do not “know” that the unifying group has been broken into $SU(3) \otimes SU(2) \otimes U(1)$.
These equations are derived assuming that all fermion masses are small compared to $\mu$.

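Rearranging these equations somewhat, we find

$$\sin^2\theta = \frac{1}{6} + \frac{5\alpha(\mu)}{9\alpha_s(\mu)} \qquad (4)$$

$$\frac{1}{\alpha(\mu)} = \frac{8}{3}\frac{1}{\alpha_s(\mu)} + \frac{1}{6\pi}\,66\log\frac{M_{GUT}}{\mu} \qquad (5)$$

$$\frac{1}{\alpha(\mu)} = \frac{8}{3}\frac{1}{\alpha_{GUT}} + \frac{1}{6\pi}\left(\frac{32}{3}F - 22\right)\log\frac{M_{GUT}}{\mu} \qquad (6)$$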

We obtain in (4) a prediction for $\sin^2\theta(\mu)$ independent of $M_{GUT}$ and of the number of
families.
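
As a rough numerical illustration of how the three one-loop equations fix $\sin^2\theta$, $M_{GUT}$, and $\alpha_{GUT}$, here is a minimal sketch. It uses the combination $\sin^2\theta = \frac{1}{6} + 5\alpha/9\alpha_s$, the difference $1/\alpha - \frac{8}{3}(1/\alpha_s) = (66/6\pi)\log(M_{GUT}/\mu)$ that follows from (1), (2), and (3), and then (1) for $\alpha_{GUT}$. The function name and the sample inputs are my assumptions: approximate values $\alpha \approx 1/128$ and $\alpha_s \approx 0.118$ at $\mu \approx 91$ GeV. Threshold effects and higher order corrections are ignored.

```python
import math


def unify(alpha, alpha_s, mu_gev, families=3):
    """Solve the one-loop equations (1)-(3) for sin^2(theta), M_GUT, alpha_GUT."""
    # Combination of (1), (2), (3) that eliminates M_GUT and alpha_GUT.
    sin_sq = 1.0 / 6.0 + 5.0 * alpha / (9.0 * alpha_s)
    # Combination that eliminates alpha_GUT and the number of families.
    log_ratio = (6.0 * math.pi / 66.0) * (1.0 / alpha - 8.0 / (3.0 * alpha_s))
    m_gut_gev = mu_gev * math.exp(log_ratio)
    # Equation (1) then gives the unified coupling.
    inv_alpha_gut = 1.0 / alpha_s - (4 * families - 33) * log_ratio / (6.0 * math.pi)
    return sin_sq, m_gut_gev, 1.0 / inv_alpha_gut


# Illustrative inputs (approximate, assumed): alpha ~ 1/128, alpha_s ~ 0.118 near 91 GeV.
sin_sq, m_gut, alpha_gut = unify(1 / 128, 0.118, 91.2)
print(f"sin^2(theta) ~ {sin_sq:.3f}, M_GUT ~ {m_gut:.1e} GeV, 1/alpha_GUT ~ {1 / alpha_gut:.0f}")
```

With these inputs the unification scale lands in the range $10^{14\text{--}15}$ GeV quoted above, while $\sin^2\theta$ comes out near 0.20, somewhat below the measured value of about 0.23. This is one way to see that this simplest scheme works surprisingly well but not perfectly.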

Note that (5) gives the bound

$$\frac{1}{\alpha(\mu)} > \frac{1}{6\pi}\,66\log\frac{M_{GUT}}{\mu} \qquad (7)$$

A lower bound on the proton lifetime (and hence on $M_{GUT}$) translates into an upper bound
on the fine structure constant. Amusingly, the stability of the world implies the weakness
of electromagnetism.

As I noted earlier, plugging in the measured value of $\alpha_s$, we obtain a huge value for
$M_{GUT}$. I regard this as a triumph of grand unification: $M_{GUT}$ could have come out to have
a much lower scale, leading to an immediate contradiction with the observed stability of the
proton, but it didn’t. Another way of looking at it is that if we are somehow given $M_{GUT}$ and
$\alpha_{GUT}$, grand unification fixes the couplings of all three nongravitational interactions! The
point is not that this simplest try at grand unification doesn’t quite agree with experiment:
The miracle is that it works at all.

It is beyond the scope of this book to discuss in detail the comparison of (4), (5), and 
(6) with experiment. To do serious phenomenology, one has to include threshold effects 
(see exercise VII.6.1), higher order corrections, and so on. To make a long story short, 
after grand unified theory came out there was enormous excitement over the possibility 
of proton decay. Alas, the experimental lower bound on the proton lifetime was eventually 
pushed above the prediction. This certainly does not mean the demise of the notion of 
grand unification. Indeed, as I mentioned earlier, the perfect fit is enough to convince 
most particle theorists of the essential correctness of the idea. Over the years people have 
proposed adding various hypothetical particles to the theory to promote proton longevity. 
The idea is that these particles would affect the renormalization group flow and hence
$M_{GUT}$. The proton lifetime is actually not the most critical issue. With more accurate
measurements of $\alpha_s$ and of $\theta$, it was found that the three couplings do not quite meet at
a point. Indeed, for believers in low energy supersymmetry, part of their faith is founded
on the fact that with supersymmetric particles included, the three coupling constants do
meet.¹ But skeptics of course can point to the extra freedom to maneuver.


Branching ratios 

You may have realized that (1), (2), and (3) are not specific for SU (5): they hold as long 
as SU (3) ® SU (2) <g> U (1) is unified into some simple group (simple so that there is only 
one gauge coupling g). 

Let us now focus on $SU(5)$. Recall that we decompose the $SU(5)$ index $\mu$, which can
take on five values, into two types. In other words, the index $\mu$ is labeled by $\{\alpha, i\}$, where $\alpha$
takes on three values and $i$ takes on two values. The gauge bosons in $SU(5)$ correspond to
the 24 independent components of the traceless hermitean field $A^\mu_\nu$ ($\mu, \nu = 1, 2, \ldots, 5$)
transforming as the adjoint representation. Focusing on the group theory of $SU(5)$, we
will suppress Lorentz indices, spinor indices, etc. Clearly, the eight gauge bosons in $SU(3)$
transform an index of type $\alpha$ into an index of type $\alpha$, while the three gauge bosons in $SU(2)$
transform an index of type $i$ into an index of type $i$. Then there is the $U(1)$ gauge boson
that couples to the hypercharge $\frac{1}{2}Y$. (Of course, you know what I mean by my somewhat
loose language: The $SU(3)$ gauge bosons transform fields carrying a color index into a
field carrying a color index.)

The fun comes with the gauge bosons $A^\alpha_i$ and $A^i_\alpha$, which transform the index $\alpha$ into
the index $i$ and vice versa. Since $\alpha$ takes on three values, and $i$ takes on two values, there
are 6 + 6 = 12 such gauge bosons, thus accounting for all the gauge bosons in $SU(5)$. In
other words, $24 \to (8, 1) + (1, 3) + (1, 1) + (3, 2) + (3^*, 2)$. We will now see explicitly that
the exchange of these bosons between quarks and leptons leads to proton decay.
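
The counting of the 24 gauge bosons can be made explicit by classifying the components $A^\mu_\nu$ by the types of their two indices. A minimal sketch (the tags "color" and "weak" are my labels; the hypercharge assigned to $A^\mu_\nu$ is taken to be the difference of the $Y/2$ values of its upper and lower indices, which is an assumption about conventions and only fixes the overall sign):

```python
from collections import Counter
from fractions import Fraction

# Index types and Y/2 values of the five components of the defining 5, as in (VII.5.3).
KIND = ["color"] * 3 + ["weak"] * 2
HALF_Y = [Fraction(-1, 3)] * 3 + [Fraction(1, 2)] * 2


def classify_components():
    """Count the 25 entries of a 5 x 5 matrix A^mu_nu by index types and Y/2."""
    counts = Counter()
    for mu in range(5):
        for nu in range(5):
            counts[(KIND[mu], KIND[nu], HALF_Y[mu] - HALF_Y[nu])] += 1
    return counts


COUNTS = classify_components()
# Tracelessness removes one combination of the 9 + 4 block-diagonal entries.
NEUTRAL = COUNTS[("color", "color", 0)] + COUNTS[("weak", "weak", 0)] - 1
MIXED = sum(n for (a, b, _), n in COUNTS.items() if a != b)
print(COUNTS, NEUTRAL, MIXED)
```

The 12 block-diagonal components are the 8 + 3 + 1 gauge bosons of $SU(3) \otimes SU(2) \otimes U(1)$, all with zero hypercharge; the 12 mixed components carry hypercharge $\pm\frac{5}{6}$ and are the ones that connect quarks to leptons.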

We merely have to write down the terms in the Lagrangian involving the coupling of
the bosons $A^\alpha_i$ and $A^i_\alpha$ to fermions and draw the appropriate Feynman diagrams. I will
go through part of the group theoretic analysis, leaving you to work out the rest. Simply
by contracting indices we see that the boson $A^\mu_\nu$ acting on $\psi_\mu$ takes it to $\psi_\nu$ and acting on
$\psi^{\nu\rho}$ takes it to $\psi^{\mu\rho}$. Let us look at what $A^5_\alpha$ does, using your result from exercise VII.5.1.
It takes

$$\psi_5 = e^- \to \psi_\alpha = \bar d \qquad (8)$$

$$\psi^{\alpha\beta} = \bar u \to \psi^{5\beta} = u \qquad (9)$$

and

$$\psi^{\alpha 4} = d \to \psi^{54} = e^+ \qquad (10)$$
1 See, e.g., F. Wilczek, in: V. Fitch et al., eds., Critical Problems in Physics, p. 297. 
Figure VII.6.2


Thus, the exchange of $A^5_\alpha$ generates the process (figure VII.6.2) $u + d \to \bar u + e^+$, leading to
proton decay $p(uud) \to \pi^0(u\bar u) + e^+$. Observe that while the decay $p \to \pi^0 + e^+$ violates
both baryon number $B$ and lepton number $L$, it conserves the combination $B - L$.

In exercise VII.6.2 you will work out the branching ratios for various decay modes. Too 
bad experimentalists have not yet measured them. 

Fermion masses 

We might hope that with grand unification we would gain new understanding of quark 
and lepton masses. Unfortunately, the situation on fermion masses in SU (5) is muddled, 
and to this day nobody understands the origin of quark and lepton masses. 

Introducing a Higgs field $\varphi^\mu$ transforming as the 5 (as indicated by the notation) we can
write the coupling

$$\psi_\mu C \psi^{\mu\nu} \bar\varphi_\nu \qquad (11)$$

and

$$\psi^{\mu\nu} C \psi^{\lambda\rho} \varphi^\sigma \varepsilon_{\mu\nu\lambda\rho\sigma} \qquad (12)$$

(with $\bar\varphi_\nu$ the conjugate 5*), reflecting the group theoretic fact (see appendix C) that $5^* \otimes 10$
contains the 5 and $10 \otimes 10$ contains the 5*.

Since $5 \to (3, 1, -\frac{1}{3}) \oplus (1, 2, \frac{1}{2})$ we see that this Higgs field is just the natural extension
of the $SU(2) \otimes U(1)$ Higgs doublet $(1, 2, \frac{1}{2})$. Not wanting to break electromagnetism, we
allow only the electrically neutral fourth component of $\varphi$ to acquire a vacuum expectation
value. Setting $\langle\varphi^4\rangle = v$, we obtain (up to uninteresting overall constants)

$$\psi_\alpha C \psi^{\alpha 4} + \psi_5 C \psi^{54} \Rightarrow m_d = m_e \qquad (13)$$

and

$$\psi^{\alpha\beta} C \psi^{\gamma 5} \varepsilon_{\alpha\beta\gamma} \Rightarrow m_u \neq 0 \qquad (14)$$
(14) 

The larger symmetry yields a mass relation $m_d = m_e$ at the unification scale; we again
have to apply the renormalization group flow. It is worth noting that the mass relation
$m_d = m_e$ comes about because as far as the fermions are concerned, $SU(5)$ has been only
broken down to $SU(4)$ by $\varphi$. The trouble is that we obtain more or less the same relation
for each of the three families, since most of the running occurs between the unification
scale $M_{GUT}$ and the top quark mass so that threshold effects give only a small correction.
Putting in numbers one gets something like

$$\frac{m_b}{m_\tau} \sim 3 \qquad (15)$$


Let us use this to predict the down sector quark masses in terms of the lepton masses.
The formula $m_b \sim 3m_\tau$ works rather well and provides indirect evidence that there can
only be three families since the renormalization group flow depends on $F$. The formula
$m_s \sim 3m_\mu$ is more or less in the ballpark, depending on what “experimental” value one
takes for $m_s$. The formula for $m_d$, on the other hand, is downright embarrassing. People
mumble something about the first family being so light and hence other effects, such as
one-loop corrections, might be important. At the cost of making the theory uglier, people
also concoct various schemes by introducing more Higgs fields, such as the 45, to give
mass to fermions.

Note that in one respect SU (5) is not as “economical” as SU (2) ® U (1), in which the 
same Higgs field that gives mass to the gauge bosons also gives mass to the fermions. 


The universe is not empty, but almost 

I mention in passing another triumph of grand unification: its ability to explain the origin 
of matter in our universe. It has long behooved physicists to understand two fundamental 
facts about the universe: (1) the universe is not empty, and (2) the universe is almost empty. 
To physicists, (1) means that the universe is not symmetric between matter and antimatter,
that is, the net baryon number $N_B$ is nonzero; and (2) is quantified by the strikingly small
observed value $N_B/N_\gamma \sim 10^{-10}$ of the ratio of the number of baryons to the number of
photons.

Suppose we start with a universe with equal quantities of matter and antimatter. For the
universe to evolve into the observed matter dominated universe, three conditions must be
satisfied: (1) The laws of the universe must be asymmetric between matter and antimatter.
(2) The relevant physical processes had to be out of equilibrium so that there was an arrow
of time. (3) Baryon number must be violated.

We know for a fact that conditions (1) and (2) indeed hold in the world: There is
CP violation in the weak interaction and the early universe expanded rapidly. As for
(3), grand unification naturally violates baryon number. Furthermore, while proton decay
(suppressed by a factor of $1/M_{GUT}^2$ in amplitude) proceeds at an agonizingly slow rate (for
those involved in the proton decay experiment!), in the early universe, when the $X$ and
$Y$ bosons are produced in abundance, their fast decays could easily drive baryon number
violation. The suppression factor $1/M_{GUT}^2$ does not come in. I have no doubt that eventually
the number $10^{-10}$ measuring “the amount of dirt in the universe” will be calculated in
some grand unified theory.

Hierarchy 

I promised you that the Weisskopf phenomenon would come back to haunt us. That the
grand unification mass scale $M_{GUT}$ naturally comes out so large counts as a triumph,
but it also leads to a problem known as the hierarchy problem. The hierarchy refers to
the enormous ratio $M_{GUT}/M_{EW}$, where $M_{EW}$ denotes the electroweak unification scale, of
order $10^2$ GeV. I will sketch this rather murky subject. Look at the Higgs field $\varphi$ responsible
for breaking electroweak theory. We don’t know its renormalized or physical mass
precisely, but we do know that it is of order $M_{EW}$. Imagine calculating the bare perturbation
series in some grand unified theory (the precise theory does not enter into the
discussion) starting with some bare mass $\mu_0$ for $\varphi$. The Weisskopf phenomenon tells
us that quantum correction shifts $\mu_0^2$ by a huge quadratically cutoff dependent amount
$\delta\mu_0^2 \sim f^2\Lambda^2 \sim f^2 M_{GUT}^2$, where we have substituted for $\Lambda$ the only natural mass scale
around, namely $M_{GUT}$, and where $f$ denotes some dimensionless coupling. To have the
physical mass squared $\mu^2 = \mu_0^2 + \delta\mu_0^2$ come out to be of order $M_{EW}^2$, something like 28 orders
of magnitude smaller than $M_{GUT}^2$, would require an extremely fine-tuned and highly
unnatural cancellation between $\mu_0^2$ and $\delta\mu_0^2$. How this could happen “naturally” poses a
severe challenge to theoretical physicists.


Naturalness 

The hierarchy problem is closely connected with the notion of naturalness dear to the theoretical
physics community. We naturally expect that dimensionless ratios of parameters
in our theories should be of order unity, where the phrase “order unity” is interpreted liberally
between friends, say anywhere from $10^{-2}$ or $10^{-3}$ to $10^2$ or $10^3$. Following ’t Hooft,
we can formulate a technical definition of naturalness: The smallness of a dimensionless
parameter $\eta$ would be considered natural only if a symmetry emerges in the limit $\eta \to 0$.
Thus, fermion masses could be naturally small, since, as you will recall from chapter II.1,
a chiral symmetry emerges when a fermion mass is set equal to zero. On the other hand,
no particular symmetry emerges when we set either the bare or renormalized mass of a
scalar field equal to zero. This represents the essence of the hierarchy problem.


Exercises 


VII.6.1 Suppose there are $F'$ new families of quarks and leptons with masses of order $M'$. Adopting the crude
approximation described in exercise VI.8.2 of ignoring these families for $\mu$ below $M'$ and of treating $M'$
as negligible for $\mu$ above $M'$, run the renormalization group flow and discuss how various predictions,
such as proton lifetime, are changed.

VII.6.2 Work out proton decay in detail. Derive relations between the following decay rates: $\Gamma(p \to \pi^0 e^+)$,
$\Gamma(p \to \pi^+ \bar\nu)$, $\Gamma(n \to \pi^- e^+)$, and $\Gamma(n \to \ldots$

VII.6.3 Show that $SU(5)$ conserves the combination $B - L$. For a challenge, invent a grand unified theory that
violates $B - L$.



VII.7 SO(10) Unification


Each family into a single representation 

At the end of chapter VII.5 we felt we had good reason to think that $SU(5)$ unification is
not the end of the story. Let us ask if we might be able to fit the 5 and 10* into a single
representation of a bigger group $G$ containing $SU(5)$.

It turns out that there is a natural embedding of $SU(5)$ into the orthogonal group $SO(10)$
that works,¹ but to explain that I have to teach you some group theory. The starting point
is perhaps somewhat surprising: We go back to chapter II.3, where we learned that the
Lorentz group $SO(3, 1)$, or its Euclidean cousin $SO(4)$, has spinor representations. We
will now generalize the concept of spinors to $d$-dimensional Euclidean space. I will work
out the details for $d$ even and leave the odd dimensions as an exercise for you. You might
also want to review appendix B now.


Clifford algebra and spinor representations 

Start with an assertion. For any integer $n$ we claim that we can find $2n$ hermitean matrices
$\gamma_i$ ($i = 1, 2, \cdots, 2n$) that satisfy the Clifford algebra

$$\{\gamma_i, \gamma_j\} = 2\delta_{ij} \tag{1}$$

In other words, to prove our claim we have to produce $2n$ hermitean matrices $\gamma_i$ that
anticommute with each other and square to the identity matrix. We will refer to the $\gamma_i$'s as
the $\gamma$ matrices for SO(2n).

For $n = 1$, it is a breeze: $\gamma_1 = \tau_1$ and $\gamma_2 = \tau_2$. There you are.


¹ Howard Georgi told me that he actually found SO(10) before SU(5).

Now iterate. Given the $2n$ $\gamma$ matrices for SO(2n) we construct the $(2n + 2)$ $\gamma$ matrices
for SO(2n + 2) as follows

$$\gamma_j^{(n+1)} = \gamma_j^{(n)} \otimes \tau_3 = \begin{pmatrix} \gamma_j^{(n)} & 0 \\ 0 & -\gamma_j^{(n)} \end{pmatrix}, \qquad j = 1, 2, \cdots, 2n \tag{2}$$

$$\gamma_{2n+1}^{(n+1)} = 1 \otimes \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{3}$$

$$\gamma_{2n+2}^{(n+1)} = 1 \otimes \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{4}$$

(Throughout this book 1 denotes a unit matrix of the appropriate size.) The superscript in
parentheses is obviously for us to keep track of which set of $\gamma$ matrices we are talking about.
Verify that if the $\gamma^{(n)}$'s satisfy the Clifford algebra, the $\gamma^{(n+1)}$'s do as well. For example,

$$\{\gamma_j^{(n+1)}, \gamma_{2n+1}^{(n+1)}\} = (\gamma_j^{(n)} \otimes \tau_3) \cdot (1 \otimes \tau_1) + (1 \otimes \tau_1) \cdot (\gamma_j^{(n)} \otimes \tau_3) = \gamma_j^{(n)} \otimes \{\tau_3, \tau_1\} = 0$$

This iterative construction yields for SO(2n) the $\gamma$ matrices

$$\gamma_{2k-1} = 1 \otimes 1 \otimes \cdots \otimes 1 \otimes \tau_1 \otimes \tau_3 \otimes \tau_3 \otimes \cdots \otimes \tau_3 \tag{5}$$

and

$$\gamma_{2k} = 1 \otimes 1 \otimes \cdots \otimes 1 \otimes \tau_2 \otimes \tau_3 \otimes \tau_3 \otimes \cdots \otimes \tau_3 \tag{6}$$

with 1 appearing $k - 1$ times and $\tau_3$ appearing $n - k$ times. The $\gamma$'s are evidently $2^n$ by $2^n$
matrices. When and if you feel confused at any point in this discussion you should work
things out explicitly for SO(4), SO(6), and so on.
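For readers who like to check such claims by machine, here is a minimal sketch in plain Python of the iteration (2)-(4), followed by a brute-force test of the Clifford algebra (1). The helper names (`kron`, `matmul`, `gamma_matrices`, `satisfies_clifford`) are illustrative choices, not anything defined in the text, and the Kronecker product uses the standard convention, which may order the basis differently from the block matrices displayed in (2).

```python
# Pauli matrices as nested lists (entries are ints or Python complex numbers).
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker (direct) product of two square matrices, standard convention.

    The first factor labels the blocks. The block form displayed in (2) puts the
    last factor outermost; that is only a reordering of the basis and does not
    affect the algebra checked below.
    """
    nb = len(b)
    size = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(size):
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]


def gamma_matrices(n):
    """Return the 2n gamma matrices for SO(2n), built by iterating (2)-(4)."""
    gammas = [T1, T2]                       # the n = 1 starting point
    for _ in range(n - 1):
        one = identity(len(gammas[0]))
        gammas = [kron(g, T3) for g in gammas] + [kron(one, T1), kron(one, T2)]
    return gammas


def satisfies_clifford(gammas):
    """Check {gamma_i, gamma_j} = 2 delta_ij entry by entry."""
    size = len(gammas[0])
    for i, gi in enumerate(gammas):
        for j, gj in enumerate(gammas):
            ab, ba = matmul(gi, gj), matmul(gj, gi)
            for r in range(size):
                for c in range(size):
                    expected = 2 if (i == j and r == c) else 0
                    if abs(ab[r][c] + ba[r][c] - expected) > 1e-12:
                        return False
    return True


for n in (1, 2, 3):
    gammas = gamma_matrices(n)
    # number of matrices, their size 2**n, and whether (1) holds
    print(n, len(gammas), len(gammas[0]), satisfies_clifford(gammas))
```

For $n = 2$ the first matrix returned is $\tau_1 \otimes \tau_3$, in agreement with (5) for $k = 1$.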

In analogy with the Lorentz group, we define $2n(2n - 1)/2 = n(2n - 1)$ hermitean
matrices

$$\sigma_{ij} \equiv \frac{i}{2}[\gamma_i, \gamma_j] \tag{7}$$

Note that $\sigma_{ij}$ is equal to $i\gamma_i\gamma_j$ for $i \neq j$ and vanishes for $i = j$. The commutation of the $\sigma$'s
with each other is thus easy to work out. For example,

$$[\sigma_{12}, \sigma_{23}] = -[\gamma_1\gamma_2, \gamma_2\gamma_3] = -\gamma_1\gamma_2\gamma_2\gamma_3 + \gamma_2\gamma_3\gamma_1\gamma_2 = -[\gamma_1, \gamma_3] = 2i\sigma_{13}$$

Roughly speaking, the $\gamma_2$'s in $\sigma_{12}$ and $\sigma_{23}$ knock each other out. Thus, you see that the
$\frac{1}{2}\sigma_{ij}$'s satisfy the same commutation relations as the generators $J^{ij}$'s of SO(2n) (as given
in appendix B). The $\frac{1}{2}\sigma_{ij}$'s represent the $J^{ij}$'s.

As $2^n$ by $2^n$ matrices, the $\sigma$'s act on an object $\psi$ with $2^n$ components that we will call the
spinor $\psi$. Consider the unitary transformation $\psi \to e^{i\omega_{ij}\sigma_{ij}}\psi$ with $\omega_{ij} = -\omega_{ji}$ a set of real
numbers. Then

$$\psi^\dagger\gamma_k\psi \to \psi^\dagger e^{-i\omega_{ij}\sigma_{ij}}\gamma_k e^{i\omega_{ij}\sigma_{ij}}\psi = \psi^\dagger\gamma_k\psi - i\omega_{ij}\psi^\dagger[\sigma_{ij}, \gamma_k]\psi + \cdots$$

for $\omega_{ij}$ infinitesimal. Using the Clifford algebra we easily evaluate the commutator as
$[\sigma_{ij}, \gamma_k] = -2i(\delta_{ik}\gamma_j - \delta_{jk}\gamma_i)$. (If $k$ is not equal to either $i$ or $j$ then $\gamma_k$ clearly commutes
with $\sigma_{ij}$, and if $k$ is equal to either $i$ or $j$, then we use $\gamma_k^2 = 1$.) We see that the set of objects
$v_k \equiv \psi^\dagger\gamma_k\psi$, $k = 1, \cdots, 2n$ transforms as a vector in $2n$-dimensional space, with $4\omega_{ij}$ the
infinitesimal rotation angle in the $ij$ plane:

$$v_k \to v_k - 2(\omega_{kj}v_j - \omega_{ik}v_i) = v_k - 4\omega_{kj}v_j \tag{8}$$

(in complete analogy to $\bar\psi\gamma^\mu\psi$ transforming as a vector under the Lorentz group.) This
gives an alternative proof that $\frac{1}{2}\sigma_{ij}$ represents the generators of SO(2n).

We define the matrix $\gamma^{\rm FIVE} \equiv (-i)^n\gamma_1\gamma_2\cdots\gamma_{2n}$, which in the basis we are using has the
explicit form

$$\gamma^{\rm FIVE} = \tau_3 \otimes \tau_3 \otimes \cdots \otimes \tau_3 \tag{9}$$

with $\tau_3$ appearing $n$ times. By analogy with the Lorentz group we define the “left handed”
spinor $\psi_L \equiv \frac{1}{2}(1 - \gamma^{\rm FIVE})\psi$ and the “right handed” spinor $\psi_R \equiv \frac{1}{2}(1 + \gamma^{\rm FIVE})\psi$, such that
$\gamma^{\rm FIVE}\psi_L = -\psi_L$ and $\gamma^{\rm FIVE}\psi_R = \psi_R$. Under $\psi \to e^{i\omega_{ij}\sigma_{ij}}\psi$, we have $\psi_L \to e^{i\omega_{ij}\sigma_{ij}}\psi_L$ and
$\psi_R \to e^{i\omega_{ij}\sigma_{ij}}\psi_R$ since $\gamma^{\rm FIVE}$ commutes with $\sigma_{ij}$. The projection into left and right handed
spinors cuts the number of components into halves and thus we arrive at the important
conclusion that the two irreducible spinor representations of SO(2n) have dimension $2^{n-1}$.
(Convince yourself that the representation cannot be reduced further.) In particular, the
spinor representation of SO(10) is $2^{10/2-1} = 2^4 = 16$-dimensional. We will see that the
$5^*$ and 10 of SU(5) can be fit into the 16 of SO(10).


Embedding unitary groups into orthogonal groups 

The unitary group SU(5) can be naturally embedded into the orthogonal group SO(10).
In fact, I will now show you that embedding SU(n) into SO(2n) is as easy as $z = x + iy$.

Consider the $2n$-dimensional real vectors $x = (x_1, \cdots, x_n, y_1, \cdots, y_n)$ and
$x' = (x'_1, \cdots, x'_n, y'_1, \cdots, y'_n)$. By definition, SO(2n) consists of linear transformations
on these two real vectors leaving their scalar product $x'x = \sum_{j=1}^n (x'_j x_j + y'_j y_j)$ invariant.

Now out of these two real vectors we can construct two $n$-dimensional complex vectors
$z = (x_1 + iy_1, \cdots, x_n + iy_n)$ and $z' = (x'_1 + iy'_1, \cdots, x'_n + iy'_n)$. The group U(n) consists of
transformations on the two $n$-dimensional complex vectors $z$ and $z'$ leaving invariant their
scalar product

$$(z')^* z = \sum_{j=1}^n (x'_j + iy'_j)^*(x_j + iy_j) = \sum_{j=1}^n (x'_j x_j + y'_j y_j) + i\sum_{j=1}^n (x'_j y_j - y'_j x_j)$$

In other words, SO(2n) leaves $\sum_{j=1}^n (x'_j x_j + y'_j y_j)$ invariant, but U(n) consists of the
subset of those transformations in SO(2n) that leave invariant not only $\sum_{j=1}^n (x'_j x_j + y'_j y_j)$
but also $\sum_{j=1}^n (x'_j y_j - y'_j x_j)$.
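The distinction between the two invariants can be seen numerically for $n = 2$. The sketch below is only an illustration: the sample unitary matrix `U`, the test vectors, and the helper names are arbitrary choices made here, not taken from the text. A U(2) transformation preserves both sums, while a rotation in the $x_1$-$x_2$ plane alone (a perfectly good element of SO(4)) preserves only the first.

```python
import math


def invariants(x, y, xp, yp):
    """Real and imaginary parts of (z')* z for z = x + iy and z' = x' + iy'."""
    sym = sum(a * b + c * d for a, b, c, d in zip(xp, x, yp, y))
    antisym = sum(a * d - c * b for a, b, c, d in zip(xp, x, yp, y))
    return sym, antisym


def apply_unitary(u, x, y):
    """Act with the complex matrix u on z = x + iy; return the new (x, y)."""
    z = [complex(a, b) for a, b in zip(x, y)]
    w = [sum(u[i][j] * z[j] for j in range(len(z))) for i in range(len(z))]
    return [c.real for c in w], [c.imag for c in w]


def rotate_x12(theta, x, y):
    """Rotate in the x1-x2 plane only: an element of SO(2n) that is not in U(n)."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1]] + list(x[2:]), list(y)


r = 1 / math.sqrt(2)
U = [[r, 1j * r], [1j * r, r]]      # a sample U(2) matrix (arbitrary choice)

x, y = [1.0, 2.0], [0.5, -1.0]      # arbitrary test vectors
xp, yp = [0.3, -0.7], [2.0, 1.5]

print(invariants(x, y, xp, yp))
print(invariants(*apply_unitary(U, x, y), *apply_unitary(U, xp, yp)))
print(invariants(*rotate_x12(0.4, x, y), *rotate_x12(0.4, xp, yp)))
```

The first two printed pairs agree up to rounding; in the third only the first entry survives unchanged.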

Now that we understand this natural embedding of U(n) into SO(2n), we see that the
defining or vector representation of SO(2n), which we will call simply $2n$, decomposes
upon restriction to U(n) into the two defining representations of U(n), $n$ and $n^*$; thus

$$2n \to n \oplus n^* \tag{10}$$

In other words, $(x_1, \cdots, x_n, y_1, \cdots, y_n)$ can be written as $(x_1 + iy_1, \cdots, x_n + iy_n)$ and
$(x_1 - iy_1, \cdots, x_n - iy_n)$. Note that this is the analog of (VII.5.4) indicating that the defining
representation of SU(5) decomposes into representations of SU(3) $\otimes$ SU(2) $\otimes$ U(1):

$$5^* \to (3^*, 1, \tfrac{1}{3}) \oplus (1, 2, -\tfrac{1}{2}) \tag{11}$$

Given the decomposition law (10), we can now figure out how other representations of
SO(2n) decompose when restricted to the natural subgroup U(n). The tensor representations
of SO(2n) are easy, since they are constructed out of the vector representation. [This is
precisely what we did in going from (VII.5.4) to (VII.5.7, 8, and 9).] For example, the adjoint
representation of SO(2n), which has dimension $2n(2n - 1)/2 = n(2n - 1)$, transforms as
an antisymmetric 2-index tensor $2n \otimes_A 2n$ and so decomposes into

$$2n \otimes_A 2n \to (n \oplus n^*) \otimes_A (n \oplus n^*) \tag{12}$$

according to (10). The antisymmetric product $\otimes_A$ on the right hand side is, of course, to
be evaluated within U(n). For instance, $n \otimes_A n$ is the $n(n - 1)/2$ representation of U(n).
In this way, we see that

$$n(2n - 1) \to (n^2 - 1)\ \text{(the adjoint)} \oplus 1\ \text{(the singlet)} \oplus n(n - 1)/2 \oplus (n(n - 1)/2)^* \tag{13}$$

As a check, the total dimension of the representations of U(n) on the right hand side adds
up to $(n^2 - 1) + 1 + 2n(n - 1)/2 = n(2n - 1)$. In particular, for SO(10) $\supset$ SU(5), we have
$45 \to 24 \oplus 1 \oplus 10 \oplus 10^*$ and of course $24 + 1 + 10 + 10 = 45$.
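The dimension count just quoted is easy to automate for any $n$. The short sketch below simply tabulates the four pieces of the adjoint decomposition and compares their sum with $n(2n - 1)$; the function name `adjoint_pieces` is a hypothetical label introduced here.

```python
def adjoint_pieces(n):
    """Dimensions of the U(n) pieces of the SO(2n) adjoint:
    adjoint, singlet, antisymmetric tensor, and its conjugate."""
    antisym = n * (n - 1) // 2
    return [n * n - 1, 1, antisym, antisym]


for n in (3, 4, 5):
    pieces = adjoint_pieces(n)
    print(n, pieces, sum(pieces) == n * (2 * n - 1))
```

For $n = 5$ the list is 24, 1, 10, 10, matching the decomposition of the 45.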

Decomposing the spinor 

It is more difficult to figure out how the spinor representation of SO(2n) decomposes
upon restriction to U(n). I give here a heuristic argument that satisfies most physicists,
but certainly not mathematicians. I will just do SO(10) $\supset$ SU(5) and let you work out
the general case. The question is how the 16 falls apart. Just from numerology and from
knowing the dimensions of the smaller representations of SU(5) (1, 5, 10, 15) we see there
are only so many possibilities, some of them rather unlikely, for example, the 16 falling
apart into 16 1's.

Picture the spinor 16 of SO(10) breaking up into a bunch of representations of SU(5).
By definition, the 45 generators of SO(10) scramble all these representations together. Let
us ask what the various pieces of 45, namely $24 \oplus 1 \oplus 10 \oplus 10^*$, do to these representations.

The 24 transform each of the representations of SU(5) into itself, of course, because they
are the 24 generators of SU(5) and that is what generators were born to do. The generator
1 can only multiply each of these representations by a real number. (In other words, the
corresponding group element multiplies each of the representations by a phase factor.)

What does the 10, which as you recall from chapter VII.5 is represented as an antisymmetric
tensor with two upper indices and hence also known as [2], do to these representations?
Suppose the bunch of representations that $S$ breaks up into contains the singlet
[0] = 1 of SU(5). The 10 = [2] acting on [0] gives the [2] = 10. (Almost too obvious for words!
An antisymmetric tensor of two indices combined with a tensor with no indices is an
antisymmetric tensor of two indices.) What about 10 = [2] acting on [2]? The result is a tensor
with four upper indices. It certainly contains the [4], which is equivalent to $[1]^* = 5^*$. But
look, $1 \oplus 10 \oplus 5^*$ already add up to 16. Thus, we have accounted for everybody. There can't
be more. So we conclude

$$S^+ \to [0] \oplus [2] \oplus [4] = 1 \oplus 10 \oplus 5^* \tag{14}$$

The $5^*$ and the 10 of SU(5) fit inside the $16^+$ of SO(10)!

We will learn later that the two spinor representations of SO(10) are conjugate to each
other. Indeed, you may have noticed that I snuck a superscript plus on the letter $S$. The
conjugate spinor $S^-$ breaks up into the conjugate of the representations in (14):

$$S^- \to [1] \oplus [3] \oplus [5] = 5 \oplus 10^* \oplus 1^* \tag{15}$$
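The numerology in (14) and (15) can be checked by counting components: the antisymmetric tensor $[k]$ of SU(n) has $\binom{n}{k}$ components. The sketch below assumes the pattern suggested by (14) and (15), even $k$ in $S^+$ and odd $k$ in $S^-$, and confirms that each tower adds up to $2^{n-1}$; `spinor_branching` is a name made up for this illustration.

```python
from math import comb


def spinor_branching(n):
    """Dimensions of the antisymmetric tensors [k] of SU(n) inside S+ and S-.

    Assumes the pattern of (14) and (15): S+ collects even k, S- collects odd k.
    """
    s_plus = [comb(n, k) for k in range(0, n + 1, 2)]
    s_minus = [comb(n, k) for k in range(1, n + 1, 2)]
    return s_plus, s_minus


s_plus, s_minus = spinor_branching(5)
print(s_plus, sum(s_plus))      # [1, 10, 5] 16
print(s_minus, sum(s_minus))    # [5, 10, 1] 16
```

This is a counting check only; it does not replace the heuristic argument about how the generators act.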

The long lost antineutrino 

The fit would be perfect if we introduce one more field transforming as a 1, that is, a singlet
under SU(5) and hence a fortiori a singlet under SU(3) $\otimes$ SU(2) $\otimes$ U(1). In other words,
this field does not participate in the strong, weak, and electromagnetic interactions, or in
plain English, it describes a lepton with no electric charge and is not involved in the known
weak interaction. Thus, this field can be identified as the “long lost” antineutrino field $\nu^c_L$.
This guy does not listen to any of the known gauge bosons.

Recall that we are using a convention in which all fermion fields are left handed, and
hence we have written $\nu^c_L$. By a conjugate transformation, as explained earlier, this is
equivalent to the right handed neutrino field $\nu_R$.

Since $\nu_R$ is an SU(5) singlet, we can give it a Majorana mass $M$ without breaking SU(5).
Hence we expect $M$ to be larger than or of the same order of magnitude as the mass scale
at which SU(5) is broken, which as we saw in chapter VII.5 is much higher than the mass
scales that have been explored experimentally. This explains why $\nu_R$ has not been seen.

On the other hand, with the presence of $\nu_R$ we can have a Dirac mass term $m(\bar\nu_L\nu_R +
{\rm h.c.})$. Since this term breaks SU(2) $\otimes$ U(1) just like the mass terms for the quarks and
leptons we know, we expect $m$ to be of the same order of magnitude² as the known quark
and lepton masses (which for reasons unknown span an enormous range).

² Explicitly, with $\nu_R$ now available we can add to the SU(2) $\otimes$ U(1) theory of chapter VII.2 the term $f'\tilde\varphi\,\bar\nu_R\psi_L$,
where $\tilde\varphi = \tau_2\varphi^\dagger$. In the absence of any indication to the contrary, we might suppose that $f'$ is of the same order
of magnitude as the coupling $f$ that leads to the electron mass.

Thus, in the space spanned by $(\nu, \nu^c)$ we have the (Majorana) mass matrix

$$\mathcal{M} = \begin{pmatrix} 0 & m \\ m & M \end{pmatrix} \tag{16}$$

with $M \gg m$. Since the trace and determinant of $\mathcal{M}$ are $M$ and $-m^2$, respectively, $\mathcal{M}$ has a
large eigenvalue $\simeq M$ and a small eigenvalue of magnitude $\simeq m^2/M$. A tiny mass $\sim m^2/M$, suppressed
relative to the usual quark and lepton masses by the factor $m/M$, is naturally generated
for the (observed) left handed neutrino. This rather attractive scenario, known as the
seesaw mechanism for obvious reason, was discovered independently by Minkowski and
by Glashow, and somewhat later by Yanagida and by Gell-Mann, Ramond, and Slansky.
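The trace and determinant argument can be made concrete with a few lines of arithmetic. In the sketch below the values of `m` and `M` are arbitrary illustrative numbers in arbitrary units, not measured masses, and `seesaw_eigenvalues` is a name introduced here. The light eigenvalue is obtained from the determinant rather than by subtracting two nearly equal numbers, which would lose precision.

```python
import math


def seesaw_eigenvalues(m, M):
    """Eigenvalues of [[0, m], [m, M]] using its trace M and determinant -m**2.

    The heavy eigenvalue comes from the quadratic formula; the light one follows
    from light * heavy = -m**2, which avoids a cancellation between large numbers.
    """
    heavy = (M + math.sqrt(M * M + 4.0 * m * m)) / 2.0
    light = -m * m / heavy
    return light, heavy


m, M = 1.0, 1.0e6               # arbitrary illustrative values, M >> m
light, heavy = seesaw_eigenvalues(m, M)
print(abs(light), m * m / M)    # light mass versus the estimate m**2 / M
print(heavy, M)                 # heavy mass versus M
```

The light eigenvalue comes out negative; the sign can be absorbed by a field redefinition, so only its magnitude is a mass.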

Again, the tight fit of the $5^*$ and the 10 of SU(5) inside the $16^+$ of SO(10) has convinced
many physicists that it is surely right.


A binary code for the world 

Given the product form of the $\gamma$ matrices in (5) and (6), and hence of $\sigma_{ij}$, we can write the
states of the spinor representations as

$$|\varepsilon_1\varepsilon_2\cdots\varepsilon_n\rangle \tag{17}$$

where each of the $\varepsilon$'s takes on the values $\pm 1$. For example, for $n = 1$, $\tau_1|+\rangle = |-\rangle$ and
$\tau_1|-\rangle = |+\rangle$, while $\tau_2|+\rangle = i|-\rangle$ and $\tau_2|-\rangle = -i|+\rangle$. From (9) we see that

$$\gamma^{\rm FIVE}|\varepsilon_1\varepsilon_2\cdots\varepsilon_n\rangle = \Big(\prod_{j=1}^n \varepsilon_j\Big)|\varepsilon_1\varepsilon_2\cdots\varepsilon_n\rangle \tag{18}$$

The right handed spinor $S^+$ consists of those states $|\varepsilon_1\varepsilon_2\cdots\varepsilon_n\rangle$ with $\prod_{j=1}^n \varepsilon_j = +1$, and
the left handed spinor $S^-$ those states with $\prod_{j=1}^n \varepsilon_j = -1$. Indeed, the spinor representations
have dimension $2^{n-1}$.

Thus, in SO(10) unification the fundamental quarks and leptons are described by a five-bit
binary code, with states such as $|++--+\rangle$ and $|-+---\rangle$. Personally, I find
this a rather pleasing picture of the world.

Let us work out the states explicitly. This also gives me a chance to make sure that
you understand the group theory presented in this chapter. Start with the much simpler
case of SO(4). The spinor $S^+$ consists of $|++\rangle$ and $|--\rangle$ while the spinor $S^-$ consists
of $|+-\rangle$ and $|-+\rangle$. As discussed in chapter II.3, SO(4) contains two distinct SU(2)
subgroups. Removing a few factors of $i$ from the discussion in chapter II.3 we see that
the third generator of SU(2), call it $\sigma_3$, can be taken to be either $\sigma_{12} - \sigma_{34}$ or $\sigma_{12} + \sigma_{34}$.
The two choices correspond to the two distinct SU(2) subgroups. We choose (arbitrarily)
$\sigma_3 = \frac{1}{2}(\sigma_{12} - \sigma_{34})$. From (5) and (6) we have $\sigma_{12} = i\gamma_1\gamma_2 = i(\tau_1 \otimes \tau_3)(\tau_2 \otimes \tau_3) = -\tau_3 \otimes 1$ and
$\sigma_{34} = -1 \otimes \tau_3$, and so $\sigma_3 = \frac{1}{2}(-\tau_3 \otimes 1 + 1 \otimes \tau_3)$. To figure out how the four states $|++\rangle$,
$|--\rangle$, $|+-\rangle$, and $|-+\rangle$ transform under our chosen SU(2), let us act on them with $\sigma_3$.
For example,

$$\sigma_3|++\rangle = \tfrac{1}{2}(-\tau_3 \otimes 1 + 1 \otimes \tau_3)|++\rangle = \tfrac{1}{2}(-1 + 1)|++\rangle = 0$$

and

$$\sigma_3|-+\rangle = \tfrac{1}{2}(-\tau_3 \otimes 1 + 1 \otimes \tau_3)|-+\rangle = \tfrac{1}{2}(1 + 1)|-+\rangle = |-+\rangle$$

Aha, under SU(2) $|++\rangle$ and $|--\rangle$ are two singlets while $|+-\rangle$ and $|-+\rangle$ make up a
doublet.
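Since $\sigma_3$ is diagonal in the $|\varepsilon_1\varepsilon_2\rangle$ basis, the whole SO(4) exercise reduces to arithmetic on signs. The sketch below tabulates the four states with their chirality from (18) and their $\sigma_3$ eigenvalue; the function name `sigma3` is just a label chosen here.

```python
from itertools import product


def sigma3(eps1, eps2):
    """Eigenvalue of sigma_3 = (1/2)(-tau_3 x 1 + 1 x tau_3) on |eps1 eps2>."""
    return (-eps1 + eps2) / 2


for eps in product((+1, -1), repeat=2):
    chirality = eps[0] * eps[1]                  # eigenvalue of gamma^FIVE, eq. (18)
    spinor = "S+" if chirality == +1 else "S-"
    print(eps, spinor, sigma3(*eps))
```

The two $S^+$ states carry eigenvalue 0 and the two $S^-$ states carry eigenvalues $\pm 1$, as expected for two singlets and one doublet.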

Note that this is consistent with the generalization of (14) and (15), namely that upon
the restriction of SO(2n) to U(n) the spinors decompose as

$$S^+ \to [0] \oplus [2] \oplus \cdots \tag{19}$$

and

$$S^- \to [1] \oplus [3] \oplus \cdots \tag{20}$$

I have not indicated the end of the two sequences: A moment's reflection indicates that it
depends on whether $n$ is even or odd. In our example, $n = 2$, and thus $2^+ \to [0] \oplus [2] = 1 \oplus 1$
and $2^- \to [1] = 2$. Similarly, for $n = 3$, upon the restriction of SO(6) to U(3), $4^+ \to
[0] \oplus [2] = 1 \oplus 3^*$ and $4^- \to [1] \oplus [3] = 3 \oplus 1$. (Our choice of which triplet representation
of U(3) to call 3 or $3^*$ is made to conform to common usage, as we will see presently.)

We are now ready to figure out the identity of each of the 16 states such as $|++--+\rangle$
in SO(10) unification. First of all, (18) tells us that under the subgroup SO(4) $\otimes$ SO(6) of
SO(10) the spinor $16^+$ decomposes as (since $\prod_{j=1}^5 \varepsilon_j = +1$ implies $\varepsilon_1\varepsilon_2 = \varepsilon_3\varepsilon_4\varepsilon_5$)

$$16^+ \to (2^+, 4^+) \oplus (2^-, 4^-) \tag{21}$$

We identify the natural SU(2) subgroup of SO(4) as the SU(2) of the electroweak interaction
and the natural SU(3) subgroup of SO(6) as the color SU(3) of the strong interaction.
Thus, according to the preceding discussion, $(2^+, 4^+)$ are the SU(2) singlets of the standard
U(1) $\otimes$ SU(2) $\otimes$ SU(3) model, while $(2^-, 4^-)$ are the SU(2) doublets. Here is the
lineup (all fields being left handed as usual):

SU(2) doublets:

$\nu = |-+---\rangle$

$e^- = |+----\rangle$

$u = |-+++-\rangle$, $|-++-+\rangle$, and $|-+-++\rangle$

$d = |+-++-\rangle$, $|+-+-+\rangle$, and $|+--++\rangle$


SU(2) singlets:

$u^c = |+++--\rangle$, $|++-+-\rangle$, and $|++--+\rangle$

$d^c = |--+--\rangle$, $|---+-\rangle$, and $|----+\rangle$.


I assure you that this is a lot of fun to work out and I urge you to reconstruct this
table without looking at it. Here are a few hints if you need help. From our discussion
of SU(2) I know that $\nu = |-+\varepsilon_3\varepsilon_4\varepsilon_5\rangle$ and $e^- = |+-\varepsilon_3\varepsilon_4\varepsilon_5\rangle$, but how do I know that
$\varepsilon_3 = \varepsilon_4 = \varepsilon_5 = -1$? First, I know that $\varepsilon_3\varepsilon_4\varepsilon_5 = -1$. I also know that $4^- \to 3 \oplus 1$ upon
restricting SO(6) to color SU(3). Well, of the four states $|---\rangle$, $|++-\rangle$, $|+-+\rangle$,
and $|-++\rangle$ the “odd man out” is clearly $|---\rangle$. By the same heuristic argument,
among the 16 possible states $|+++++\rangle$ is the “odd man out” and so must be $\nu^c$.

There are lots of consistency checks. For example, once I identify $\nu = |-+---\rangle$,
$e^- = |+----\rangle$, and $\nu^c = |+++++\rangle$, I can figure out the electric charge $Q$,
which, since it transforms as a singlet under color SU(3), must have the value $Q =
a\varepsilon_1 + b\varepsilon_2 + c(\varepsilon_3 + \varepsilon_4 + \varepsilon_5)$ when acting on the state $|\varepsilon_1\varepsilon_2\varepsilon_3\varepsilon_4\varepsilon_5\rangle$. The constants $a$, $b$, and
$c$ can be determined from the three equations $Q(\nu) = -a + b - 3c = 0$, $Q(e^-) = -1$, and
$Q(\nu^c) = 0$. Thus, $Q = -\frac{1}{2}\varepsilon_1 + \frac{1}{6}(\varepsilon_3 + \varepsilon_4 + \varepsilon_5)$.
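One way to run the consistency checks mentioned above is to enumerate all 16 states with $\prod_j \varepsilon_j = +1$ and evaluate the charge formula on each. The sketch below does this with exact fractions. The helper names are illustrative, and the doublet or singlet tag simply uses the sign of $\varepsilon_1\varepsilon_2$ as in (21).

```python
from fractions import Fraction
from itertools import product


def charge(eps):
    """Electric charge Q = -eps1/2 + (eps3 + eps4 + eps5)/6 of |eps1 ... eps5>."""
    return Fraction(-eps[0], 2) + Fraction(eps[2] + eps[3] + eps[4], 6)


def label(eps):
    """Render a tuple of signs as a string such as '-+---'."""
    return "".join("+" if e > 0 else "-" for e in eps)


# The 16+ consists of the five-bit states whose signs multiply to +1.
states_16_plus = [eps for eps in product((+1, -1), repeat=5)
                  if eps[0] * eps[1] * eps[2] * eps[3] * eps[4] == +1]

for eps in states_16_plus:
    kind = "doublet" if eps[0] * eps[1] == -1 else "singlet"
    print(label(eps), kind, charge(eps))
```

The quark states come out with charges $2/3$ and $-1/3$, their conjugates with the opposite charges, and the charges of the 16 states sum to zero.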

Living in the computer age, I find it intriguing that the fundamental constituents
of matter are coded by five bits. You can tell your condensed matter colleagues that
their beloved electron is composed of the binary strings $+----$ and $--+++$. An
intriguing possibility³ suggests itself, that quarks and leptons may be composed of five
different species of fundamental fermionic objects. We construct composites, writing a
$+$ if that species is present, and a $-$ if it is absent. For example, from the expression for
$Q$ given above, we see that species 1 carries electric charge $-\frac{1}{2}$, species 2 is neutral, and
species 3, 4, and 5 carry charge $\frac{1}{6}$. A more or less concrete model can even be imagined by
binding these fundamental fermionic objects to a magnetic monopole.

I emphasize that particles transforming in $16^-$, such as $|+-+++\rangle$, have not been
observed experimentally.


A speculation on the origin of families 

One of the great unsolved puzzles in particle physics is the family problem. Why do quarks
and leptons come in three generations $\{\nu_e, e, u, d\}$, $\{\nu_\mu, \mu, c, s\}$, and $\{\nu_\tau, \tau, t, b\}$? The way
we incorporate this experimental fact into our present day theory can only be described
as pathetic: We repeat the fermionic sector of the Lagrangian three times without any
understanding whatsoever. Three generations living together gives rise to a nagging family
problem.

Our binary code view of the world suggests a wildly speculative (perhaps too speculative
to mention in a textbook?) approach to the family problem: We add more bits. To me,
a reasonable possibility is to “hyperunify” into an SO(18) theory, putting all fermions
into a single spinorial representation $S^+ = 256^+$, which upon the breaking of SO(18) to
SO(10) $\otimes$ SO(8) decomposes as

$$256^+ \to (16^+, 8^+) \oplus (16^-, 8^-) \tag{22}$$

We have a lot of $16^+$'s. Unhappily, we see that group theory [see also (21)] dictates that we
also get a bunch of unwanted $16^-$'s. One suggestion is that Nature might repeat the trick
She uses with color SU(3), whose strong force confines fields that are not color singlets
(chapter VII.3). Interestingly, we can exploit a striking feature of SO(8), which some people
regard as the most beautiful of all groups. In particular, the two spinorial representations
$8^\pm$ have the same dimension as the vectorial representation $8^v$ (the equation $2^{n-1} = 2n$
has the unique solution $n = 4$). There is a transformation that cyclically rotates these three
representations $8^+$, $8^-$, and $8^v$ into each other (in the jargon, the group SO(8) admits an
outer automorphism). Thus, there exists a subgroup SO(5) of SO(8) such that when we
break SO(8) into that SO(5), $8^+$ behaves like $8^v$ while $8^-$ behaves like a spinor, namely

$$8^+ \to 5 \oplus 1 \oplus 1 \oplus 1 \quad \text{and} \quad 8^- \to 4 \oplus 4^* \tag{23}$$

If we call this⁴ SO(5) hypercolor and assume that the strong force associated with it confines
all fields that are not hypercolor singlets, then only three $16^+$'s remain! Unfortunately,
as the relevant physics occurs in the energy regime above grand unification, our knowledge
of the dynamics of symmetry breaking is far too paltry for us to make any further
statements.


³ For further details, see F. Wilczek and A. Zee, Phys. Rev. D25: 553, 1982, Section IV.


Charge conjugation 


The product $\otimes$ notation we use here allows us to construct the conjugation matrix $C$ explicitly.
By definition $C^{-1}\sigma^*_{ij}C = -\sigma_{ij}$ (so that $C$ changes $e^{i\omega_{ij}\sigma_{ij}}$ into its complex conjugate.)
From (2), (3), and (4) we see that we can construct

$$C^{(n+1)} = \begin{cases} C^{(n)} \otimes \tau_1 & \text{if } n \text{ odd} \\ C^{(n)} \otimes \tau_2 & \text{if } n \text{ even} \end{cases} \tag{24}$$

You can check that this gives $C^{-1}\gamma^*_j C = (-1)^n\gamma_j$ and hence the desired result.

Explicitly, $C$ is a direct product of an alternating sequence of $\tau_1$ and $\tau_2$ and so we deduce
an important property. Acting on $|\varepsilon_1\varepsilon_2\cdots\varepsilon_n\rangle$, $C$ flips the sign of all the $\varepsilon$'s. Thus $C$
changes the sign of $\prod_{j=1}^n \varepsilon_j$ for $n$ odd, and does not for $n$ even. For $n$ odd, the two
spinor representations $S^+$ and $S^-$ are conjugates of each other, while for $n$ even, they
are conjugates of themselves, or in other words, they are real. This can also be seen
directly from $C^{-1}\gamma^{\rm FIVE}C = (-1)^n\gamma^{\rm FIVE}$. You can check this with all the explicit examples
we have encountered: SO(2), SO(4), SO(6), SO(8), SO(10), and SO(18). See also
exercise VII.7.3.
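The property $C^{-1}\gamma^*_j C = (-1)^n\gamma_j$ can be tested numerically for small $n$. The sketch below is a rough check under one explicit assumption: the recursion (24) is started from $C = \tau_2$ at $n = 1$, a starting point the text does not spell out but which is the choice that satisfies the property for SO(2). All helper names are illustrative, and the Kronecker product uses the standard convention, with each new factor appended on the right for both the $\gamma$'s and $C$.

```python
T1 = [[0, 1], [1, 0]]
T2 = [[0, -1j], [1j, 0]]
T3 = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two square matrices (standard convention)."""
    nb = len(b)
    size = len(a) * nb
    return [[a[i // nb][j // nb] * b[i % nb][j % nb] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def gamma_matrices(n):
    """The 2n gamma matrices for SO(2n) from the iteration (2)-(4)."""
    gammas = [T1, T2]
    for _ in range(n - 1):
        size = len(gammas[0])
        one = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
        gammas = [kron(g, T3) for g in gammas] + [kron(one, T1), kron(one, T2)]
    return gammas


def conjugation_matrix(n):
    """C for SO(2n) from the recursion (24), started from the assumed C = tau_2."""
    c = T2
    for level in range(1, n):            # builds C^(level+1) from C^(level)
        c = kron(c, T1 if level % 2 == 1 else T2)
    return c


def check_conjugation(n):
    """Test C^-1 gamma_j^* C = (-1)^n gamma_j; here C squares to 1, so C^-1 = C."""
    c = conjugation_matrix(n)
    sign = (-1) ** n
    for g in gamma_matrices(n):
        g_star = [[complex(x).conjugate() for x in row] for row in g]
        lhs = matmul(matmul(c, g_star), c)
        if any(abs(lhs[r][s] - sign * g[r][s]) > 1e-12
               for r in range(len(g)) for s in range(len(g))):
            return False
    return True


for n in (1, 2, 3, 4):
    print(n, check_conjugation(n))
```

Each line should report `True`; a different starting matrix at $n = 1$ would in general fail the check.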


Anomalies 


What about anomalies in SO(2n) grand unification? According to the discussion in chapter
VII.5 we have to evaluate $A^{ijklmn} \equiv {\rm tr}(J^{ij}\{J^{kl}, J^{mn}\})$ over the fermion representation.
Applying an SO(2n) transformation $J^{ij} \to O^T J^{ij} O$ we see easily that $A^{ijklmn}$ is an invariant
tensor. Can we construct an invariant 6-index tensor with the appropriate symmetry
properties (e.g., $A^{ijklmn} = -A^{jiklmn}$) in SO(2n)? We can't, except in SO(6), for which we
have $\varepsilon^{ijklmn}$. Thus, $A^{ijklmn}$ vanishes except in SO(6), where it is proportional to $\varepsilon^{ijklmn}$.
An elegant one line proof that any grand unified theory based on SO(2n) for $n \neq 3$ is free
from anomaly!

The cancellation of the anomaly between $5^*$ and 10 at the end of chapter VII.5 doesn't
seem so miraculous any more. Miracles tend to fade away as we gain deeper understanding.

Amusingly, by discussing a physics question, namely whether a gauge theory is renormalizable
or not, we have discovered a mathematical fact. What is so special about SO(6)?
See exercise VII.7.5.


⁴ The reader savvy with group theory would recognize that SO(5) is isomorphic with the symplectic group
Sp(4) and that the Dynkin diagram of SO(8) is the most symmetric of all.


Exercises 


VII.7.1 Work out the Clifford algebra in $d$-dimensional space for $d$ odd.

VII.7.2 Work out the Clifford algebra in $d$-dimensional Minkowski space.

VII.7.3 Show that the Clifford algebra for $d = 4k$ and for $d = 4k + 2$ have somewhat different properties. (If you
need help with this and the two preceding exercises, look up F. Wilczek and A. Zee, Phys. Rev. D25: 553,
1982.)

VII.7.4 Discuss the Higgs sector of SO(10). What do you need to give mass to the quarks and leptons?

VII.7.5 The group SO(6) has $6(6 - 1)/2 = 15$ generators. Notice that the group SU(4) also has $4^2 - 1 = 15$
generators. Substantiate your suspicion that SO(6) and SU(4) are isomorphic. Identify some low
dimensional representations.

VII.7.6 Show that (unfortunately) the number of families we get in SO(18) depends on which subgroup of SO(8)
we take to be hypercolor.

VII.7.7 If you want to grow up to be a string theorist, you need to be familiar with the Dirac equation in various
dimensions but especially in 10. As a warm up, study the Dirac equation in 2-dimensional spacetime.
Then proceed to study the Dirac equation in 10-dimensional spacetime.



VIII.1


Gravity as a Field Theory and the Kaluza-Klein Picture


Including gravity 


Field theory texts written a generation ago typically do not even mention gravity. The 
gravitational interaction, being so much weaker than the other three interactions, was 
simply not included in the education of particle physicists. The situation has changed with 
a vengeance: The main drive of theoretical high energy physics today is the unification of 
gravity with the other three interactions, with string theory the main candidate for a unified 
theory. 

From a course on general relativity you would have learned about the Einstein-Hilbert
action for gravity

$$S = \frac{1}{16\pi G}\int d^4x\,\sqrt{-g}\,R = \int d^4x\,\sqrt{-g}\,M_P^2 R \tag{1}$$

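where $g = \det g_{\mu\nu}$ denotes the determinant of the curved metric $g_{\mu\nu}$ of spacetime, $R$ is
the scalar curvature, and $G$ is Newton's constant. Let me remind you that the Riemann
curvature tensor

$$R^\lambda_{\ \mu\nu\kappa} = \partial_\nu\Gamma^\lambda_{\mu\kappa} - \partial_\kappa\Gamma^\lambda_{\mu\nu} + \Gamma^\sigma_{\mu\kappa}\Gamma^\lambda_{\nu\sigma} - \Gamma^\sigma_{\mu\nu}\Gamma^\lambda_{\kappa\sigma} \tag{2}$$

is constructed out of the Riemann-Christoffel symbol (recall chapter I.11):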


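$$\Gamma^\lambda_{\mu\nu} = \tfrac{1}{2}g^{\lambda\rho}(\partial_\nu g_{\rho\mu} + \partial_\mu g_{\rho\nu} - \partial_\rho g_{\mu\nu}) \tag{3}$$

The Ricci tensor is defined by $R_{\mu\kappa} = R^\nu_{\ \mu\nu\kappa}$ and the scalar curvature by $R = g^{\mu\nu}R_{\mu\nu}$. Varying
$S$ gives us¹ the Einstein field equation

$$R_{\mu\nu} - \tfrac{1}{2}g_{\mu\nu}R = -8\pi G\,T_{\mu\nu} \tag{4}$$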

The Einstein-Hilbert action is uniquely determined if we require the action to be coordinate
invariant and to involve two powers of spacetime derivative. As you can see from
(2) and (3) the scalar curvature $R$ involves two powers of derivative and the dimensionless
field $g_{\mu\nu}$ and thus has mass dimension 2. Hence $G^{-1}$ must have mass dimension 2. The
second form in (1) emphasizes this point and is often preferred in modern work on gravity.
(The modified Planck mass $M_P \equiv 1/\sqrt{16\pi G}$ differs from the usual Planck mass by a
trivial factor, much like the relation between $h$ and $\hbar$.)


¹ See, e.g., S. Weinberg, Gravitation and Cosmology, p. 364.

The theory sprang from Einstein’s profound intuition regarding the curvature of space- 
time and is manifestly formulated in terms of geometric concepts. In many textbooks, 
Einstein’s theory is developed, and rightly so, in purely geometric terms. 

On the other hand, as I hinted back in chapter I.6, gravity can be treated on the same
footing as the other interactions. After all, the graviton may be regarded as just another 
elementary particle like the photon. The action (1), however, does not look anything like 
the field theories we have studied thus far. I will now show you that in fact it does have the 
same kind of structure. 


Gravity as a field theory 

Let us write $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, where $\eta_{\mu\nu}$ denotes the flat Minkowski metric and $h_{\mu\nu}$ the
deviation from the flat metric. Expand the action in powers of $h_{\mu\nu}$. In order not to drown
in a sea of Lorentz indices, let us suppress them for a first go-around. Merely from the
fact that the scalar curvature $R$ involves two derivatives $\partial$ in its definition, we see that the
expansion must have the schematic form

$$S = \int d^4x\,\frac{1}{16\pi G}(\partial h\partial h + h\partial h\partial h + h^2\partial h\partial h + \cdots) \tag{5}$$

after dropping total divergences. As I remarked in chapter I.11, the field $h_{\mu\nu}(x)$ describes
a graviton in flat space and is to be treated like any other field. The first term $\partial h\partial h$, which
governs how the graviton propagates, is conceptually no different than the first term in the
action for a scalar field $\partial\varphi\partial\varphi$ or for the photon field $\partial A\partial A$. The terms cubic and higher in
$h$ determine the interaction of the graviton with itself.

The Einstein-Hilbert action in the weak field expansion is structurally reminiscent of the
Yang-Mills action, which may be written in schematic form as $S = \int d^4x\,(1/g^2)(\partial A\partial A +
A^2\partial A + A^4)$. As I explained in chapter IV.5, we understand the self interaction of the Yang-Mills
bosons physically: The bosons themselves carry the charge to which they couple.
We can understand the self interaction of the graviton similarly: The graviton couples to
anything carrying energy and momentum, and it certainly carries energy and momentum.
In contrast, the photon does not couple to itself.

We say that Yang-Mills and Einstein theories are nonlinear, while Maxwell theory is 
linear. The former are hard, the latter easy. 

But while the Yang-Mills action terminates, the Einstein-Hilbert action, because of the
presence of $\sqrt{-g}$ and of the inverse of $g_{\mu\nu}$, is an infinite series in the graviton field $h_{\mu\nu}$.

The other major difference is that while Yang-Mills theory is renormalizable, gravity is
notoriously nonrenormalizable, as we argued by dimensional analysis in chapter III.2. We
are now in a position to see this explicitly.

Figure VIII.1.1 (panels (a) and (b))

Consider the self energy correction to the graviton
propagator shown in figure VIII.1.1a. We see from the second term in (5) that the three-graviton
coupling involves two powers of momentum. Thus the Feynman integral goes as
$\int d^4k\,(kkkk/k^2k^2)$, with four powers of $k$ in the numerator from the two vertices and four
powers in the denominator from the propagators. Taking out two powers of momentum to
extract the coefficient of $\partial h\partial h$, we see that the correction to $1/G$ is quadratically divergent.
Because of the explicit powers of momentum in the coupling, the divergence gets worse
and worse as we go to higher and higher order. Compare figure VIII.1.1b to 1a: We have
three more propagators, worth $\sim 1/k^6$, and one more loop integration $\int d^4k$, but two more
vertices $\sim k^4$. The degree of divergence goes up by 2. Of course we already knew all this
by dimensional analysis.
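The power counting in this paragraph amounts to a one-line formula: each loop integration contributes four powers of momentum, each propagator removes two, and each graviton vertex adds two. The sketch below encodes that rule. The loop, propagator, and vertex counts for the two diagrams are read off from the description in the text (two propagators and two vertices at one loop, then three more propagators and two more vertices at two loops), and the function name is a hypothetical label.

```python
def degree_of_divergence(loops, propagators, vertices):
    """Superficial degree of divergence in four dimensions:
    +4 per loop integral, -2 per propagator, +2 per graviton vertex."""
    return 4 * loops - 2 * propagators + 2 * vertices


one_loop = degree_of_divergence(loops=1, propagators=2, vertices=2)
two_loop = degree_of_divergence(loops=2, propagators=5, vertices=4)
print(one_loop, two_loop, two_loop - one_loop)   # 4 6 2
```

Removing the two powers of external momentum that multiply the $\partial h\partial h$ structure leaves a quadratic divergence at one loop, and each further loop raises the degree by 2.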

As mentioned in chapter I.11 the fundamental definition

$$T^{\mu\nu}(x) = -\frac{2}{\sqrt{-g}}\frac{\delta S_M}{\delta g_{\mu\nu}(x)}$$

tells us that coupling of the graviton to matter (in the weak field limit) can be included by
adding the term

$$-\int d^4x\,\tfrac{1}{2}h_{\mu\nu}T^{\mu\nu} \tag{6}$$

to the action, where $T^{\mu\nu}$ stands for the (flat spacetime) stress-energy tensor of all the matter
fields of the world, a matter field being any field that is not the graviton field. Thus, with
the inclusion of matter (5) is modified schematically² to

$$S = \int d^4x\left[\frac{1}{16\pi G}(\partial h\partial h + h\partial h\partial h + h^2\partial h\partial h + \cdots) + (hT + \cdots)\right] \tag{7}$$

In chapter IV.5 I noted that we can bring Yang-Mills theory into the same convention
commonly used in Maxwell theory by a trivial rescaling $A \to gA$. Similarly, we can also
bring Einstein theory into the same convention by rescaling the graviton field $h_{\mu\nu} \to
\sqrt{G}\,h_{\mu\nu}$ so that the action becomes (to ease writing we absorb $16\pi$ into $G$ whenever we
feel like it)

$$S = \int d^4x\,(\partial h\partial h + \sqrt{G}\,h\partial h\partial h + G\,h^2\partial h\partial h + \cdots + \sqrt{G}\,hT)$$


² If this is to represent an expansion of $S$ in powers of $h$, then strictly speaking, if we display the terms cubic
and quartic in the Einstein-Hilbert action, we should also display the contribution coming from the terms of
higher order in $h$ contained in $T^{\mu\nu}(x) = -(2/\sqrt{-g})\,\delta S_M/\delta g_{\mu\nu}(x)$.

We see explicitly that $\sqrt{16\pi G} = 1/M_P$ measures the strength of the graviton coupling to
itself and to all other fields. Once again, the enormity of $M_P$ (compared to the scale of the
strong interaction, say) indicates the feebleness of gravity.

Here we expanded $g_{\mu\nu}$ around a flat metric but we could just as well expand $g_{\mu\nu} =
\bar g_{\mu\nu} + h_{\mu\nu}$, with $\bar g_{\mu\nu}$ a curved metric, that of a black hole (see chapter V.7) for instance.

Determining the weak field action 

After this index free survey we are ready to tackle the indices. We would like to determine
the first term $\partial h\partial h$ in (7) so that we can obtain the graviton propagator. Thus, we have to
expand the action $S = M_P^2\int d^4x\,\sqrt{-g}\,g^{\mu\nu}R_{\mu\nu}$ up to and including order $h^2$. From (2) and
(3) we see that the Ricci tensor $R_{\mu\nu}$ starts in $O(h)$ so that it suffices to evaluate $\sqrt{-g}\,g^{\mu\nu}$ to
$O(h)$. That's easy: As we have already seen in chapter I.11, $g = -[1 + \eta^{\mu\nu}h_{\mu\nu} + O(h^2)]$ and
$g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + O(h^2)$ so that $\sqrt{-g}\,g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + \frac{1}{2}\eta^{\mu\nu}h + O(h^2)$, where we have
defined $h \equiv \eta^{\mu\nu}h_{\mu\nu}$. We now must calculate $R_{\mu\nu}$ to $O(h^2)$, a straightforward but tedious
task starting from (2) and (3).

In line with the spirit of this book, which is to avoid tedious calculation whenever 
possible, I will now show you how to get around this. We invoke symmetry considerations! 

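Under a general coordinate transformation $x^\mu \to x'^\mu = x^\mu - \epsilon^\mu(x)$ the metric changes
to $g'^{\mu\nu} = (\partial x'^\mu/\partial x^\sigma)(\partial x'^\nu/\partial x^\tau)\, g^{\sigma\tau}$. Plugging in $g^{\mu\nu} = \eta^{\mu\nu} - h^{\mu\nu} + \cdots$, lowering the
indices (with $\eta$ to this order), and using $(\partial x'^\mu/\partial x^\sigma) = \delta^\mu_\sigma - \partial_\sigma \epsilon^\mu$, we find, treating $\partial_\mu \epsilon_\nu$ as
of the same order as $h_{\mu\nu}$:

$$h'_{\mu\nu} = h_{\mu\nu} + \partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu \qquad (8)$$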

Note the structural similarity to the electromagnetic gauge transformation
$A'_\mu = A_\mu - \partial_\mu \Lambda$. Very nice! We will explore the sense in which gravity can be regarded as a gauge
theory in more detail later.

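We are looking for the terms in the action quadratic in h and quadratic in $\partial$. Lorentz
invariance tells us that there are four possible terms (to see this, first write down terms
with the indices on the two $\partial$ matching, then the terms with the index on a $\partial$ matching an
index on an h, and so on):

$$S = \int d^4x\, (a\, \partial_\lambda h^{\mu\nu} \partial^\lambda h_{\mu\nu} + b\, \partial_\lambda h^\mu_\mu \partial^\lambda h^\nu_\nu + c\, \partial_\lambda h^{\lambda\nu} \partial^\mu h_{\mu\nu} + d\, h^\mu_\mu \partial^\lambda \partial^\nu h_{\lambda\nu})$$

with four unknown constants a, b, c, and d. Now vary S with $\delta h_{\mu\nu} = \partial_\mu \epsilon_\nu + \partial_\nu \epsilon_\mu$,
integrating by parts freely. For example,

$$\delta(\partial_\lambda h^{\mu\nu} \partial^\lambda h_{\mu\nu}) = 2[\partial_\lambda (2 \partial^\mu \epsilon^\nu)](\partial^\lambda h_{\mu\nu}) \text{ ``='' } 4 \epsilon^\nu \partial^2 \partial^\mu h_{\mu\nu}$$

Since there are three objects linear in h, linear in $\epsilon$, and cubic in $\partial$ (namely $\epsilon^\nu \partial^2 \partial_\nu h$ and
$\epsilon^\nu \partial_\nu \partial^\lambda \partial^\mu h_{\lambda\mu}$ in addition to the one already shown) the condition $\delta S = 0$ gives three
equations, just enough to fix the action up to an overall constant, corresponding to Newton's
constant. The invariant combination turns out to be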

$$\mathcal{I} \equiv \tfrac{1}{2} \partial_\lambda h^{\mu\nu} \partial^\lambda h_{\mu\nu} - \tfrac{1}{2} \partial_\lambda h^\mu_\mu \partial^\lambda h^\nu_\nu - \partial_\lambda h^{\lambda\nu} \partial^\mu h_{\mu\nu} + \partial^\nu h^\lambda_\lambda \partial^\mu h_{\mu\nu} \qquad (9)$$

Thus, even if we had never heard of the Einstein-Hilbert action we could still determine
the action for gravity in the weak field limit by requiring that the action be invariant under
the transformation (8). This is hardly surprising since coordinate invariance determines
the Einstein-Hilbert action. Still, it is nice to construct gravity "from scratch."
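
The invariance claimed for the combination (9) can be spot-checked numerically. What follows is only an illustrative sketch, not part of the derivation: it works in momentum space, where every $\partial$ becomes a factor of k (common factors of i are dropped), and evaluates the symmetrized quadratic form between an arbitrary symmetric $h_{\mu\nu}$ and a pure-gauge $\delta h_{\mu\nu} = k_\mu \epsilon_\nu + k_\nu \epsilon_\mu$. The function names, the signature (+,-,-,-), and the sample numbers are all assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

ETA = [1, -1, -1, -1]   # diagonal of eta; signature (+,-,-,-) is assumed
IDX = range(4)

def trace(h):
    # h^mu_mu for a lower-index tensor h_{mu nu}
    return sum(ETA[m] * h[m][m] for m in IDX)

def divergence(h, k_up):
    # v_nu = k^mu h_{mu nu}
    return [sum(k_up[m] * h[m][n] for m in IDX) for n in IDX]

def quadratic_form(h1, h2, k, coeffs):
    """Symmetrized momentum-space version of the four structures in (9).

    coeffs multiplies, in order:
      d_l h^{mn} d^l h_{mn},  d_l h d^l h,  d_l h^{ln} d^m h_{mn},  d^n h d^m h_{mn}
    """
    c1, c2, c3, c4 = coeffs
    k_up = [ETA[m] * k[m] for m in IDX]
    k2 = sum(k[m] * k_up[m] for m in IDX)
    v1, v2 = divergence(h1, k_up), divergence(h2, k_up)
    s1 = sum(k_up[n] * v1[n] for n in IDX)   # k^m k^n h1_{mn}
    s2 = sum(k_up[n] * v2[n] for n in IDX)   # k^m k^n h2_{mn}
    hh = sum(ETA[m] * ETA[n] * h1[m][n] * h2[m][n] for m, n in product(IDX, IDX))
    vv = sum(ETA[n] * v1[n] * v2[n] for n in IDX)
    t4 = Fraction(1, 2) * (trace(h1) * s2 + trace(h2) * s1)
    return c1 * k2 * hh + c2 * k2 * trace(h1) * trace(h2) + c3 * vv + c4 * t4

def gauge_variation(k, eps):
    # delta h_{mu nu} = k_mu eps_nu + k_nu eps_mu
    return [[k[m] * eps[n] + k[n] * eps[m] for n in IDX] for m in IDX]

k = [3, 1, 2, -1]        # lower-index momentum (sample values)
eps = [1, -2, 0, 5]      # lower-index gauge parameter (sample values)
h = [[(m + 1) * (n + 1) + (1 if m == n else 0) for n in IDX] for m in IDX]

half = Fraction(1, 2)
dh = gauge_variation(k, eps)
invariant = quadratic_form(h, dh, k, (half, -half, -1, 1))   # coefficients of (9)
broken = quadratic_form(h, dh, k, (half, -half, -1, 0))      # last term dropped
print(invariant, broken)
```

The first number vanishes for any choice of h, k, and $\epsilon$, while tampering with a single coefficient spoils the cancellation. This is the momentum-space content of the statement that $\delta S = 0$ fixes the relative coefficients.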

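Referring to (6), we can now write the weak field expansion of S as

$$S_{\rm wfg} = \int d^4x \left( \frac{1}{32\pi G}\, \mathcal{I} - \tfrac{1}{2}\, h_{\mu\nu} T^{\mu\nu} \right)$$

without having to expand R to $O(h^2)$. The coefficient of $\mathcal{I}$ is fixed by the requirement that
we reproduce the usual Newtonian gravity (see later).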

The graviton propagator 

As we anticipated in (5) the action $S_{\rm wfg}$ indeed has the same quadratic structure as all
the field theories we have studied, and so as usual the graviton propagator is just the
inverse of a differential operator. But just as in Maxwell and Yang-Mills theories the relevant
differential operator in Einstein-Hilbert theory does not have an inverse because of the
"gauge invariance" in (8).

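No problem. We have already developed the Faddeev-Popov method to deal with this
difficulty. In fact, for my limited purposes here, to derive the graviton propagator in flat
spacetime, I don't even need the full-blown Faddeev-Popov formalism with ghosts and all.³
Indeed, recall from chapter III.4 that for the Feynman gauge ($\xi = 1$) we simply add $(\partial A)^2$
to the invariant

$$\tfrac{1}{2} F_{\mu\nu} F^{\mu\nu} = \partial_\mu A_\nu (\partial^\mu A^\nu - \partial^\nu A^\mu) \text{ ``='' } -A_\mu \partial^2 A^\mu - (\partial A)^2$$

thus canceling the last term. Inverting the differential operator $-\eta^{\mu\nu} \partial^2$ we obtain the
photon propagator in the Feynman gauge $-i \eta_{\mu\nu}/k^2$. We play the same "trick" for gravity.
After staring at

$$\mathcal{I} = \tfrac{1}{2} \partial_\lambda h^{\mu\nu} \partial^\lambda h_{\mu\nu} - \tfrac{1}{2} \partial_\lambda h^\mu_\mu \partial^\lambda h^\nu_\nu - \partial_\lambda h^{\lambda\nu} \partial^\mu h_{\mu\nu} + \partial^\nu h^\lambda_\lambda \partial^\mu h_{\mu\nu}$$

for a while, we see that by adding $(\partial^\mu h_{\mu\nu} - \tfrac{1}{2} \partial_\nu h^\lambda_\lambda)^2$ we can knock off the last two terms in
$\mathcal{I}$ so that $S_{\rm wfg}$ effectively becomes

$$S_{\rm wfg} = \int d^4x \left[ \frac{1}{32\pi G}\, \tfrac{1}{2} \left( \partial_\lambda h^{\mu\nu} \partial^\lambda h_{\mu\nu} - \tfrac{1}{2} \partial_\lambda h\, \partial^\lambda h \right) - \tfrac{1}{2}\, h_{\mu\nu} T^{\mu\nu} \right] \qquad (10)$$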

In other words, the freedom in choosing $h_{\mu\nu}$ in (8) allows us to impose the so-called
harmonic gauge condition

$$\partial_\mu h^\mu_\nu = \tfrac{1}{2} \partial_\nu h^\lambda_\lambda \qquad (11)$$

(the linearized version of $\partial_\mu (\sqrt{-g}\, g^{\mu\nu}) = 0$).

³ This is because (8) does not involve the field $h_{\mu\nu}$, just as in the Maxwell case but unlike the Yang-Mills
case. Since we do not intend to calculate loop diagrams in quantum gravity, we do not need the full power of the
Faddeev-Popov method.

Writing (10) in the form

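$$S = \frac{1}{32\pi G} \int d^4x \left[ \tfrac{1}{2}\, h^{\mu\nu} K_{\mu\nu;\lambda\sigma} (-\partial^2) h^{\lambda\sigma} + O(h^3) \right]$$

we see that we have to invert the matrix

$$K_{\mu\nu;\lambda\sigma} \equiv \tfrac{1}{2} (\eta_{\mu\lambda} \eta_{\nu\sigma} + \eta_{\mu\sigma} \eta_{\nu\lambda} - \eta_{\mu\nu} \eta_{\lambda\sigma})$$

regarding $\mu\nu$ and $\lambda\sigma$ as the two indices. Note that we have to maintain the symmetry of
$h_{\mu\nu}$. In other words, we are dealing with matrices acting in a linear space spanned by
symmetric two-index tensors. Thus, the identity matrix is actually

$$I_{\mu\nu;\lambda\sigma} \equiv \tfrac{1}{2} (\eta_{\mu\lambda} \eta_{\nu\sigma} + \eta_{\mu\sigma} \eta_{\nu\lambda})$$

You can check that $K_{\mu\nu;\lambda\sigma} K^{\lambda\sigma}{}_{;\rho\omega} = I_{\mu\nu;\rho\omega}$ so that $K^{-1} = K$. Thus, in the harmonic gauge
the graviton propagator in flat spacetime is given by (scaling out Newton's constant)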
the graviton propagator in flat spacetime is given by (scaling out Newton’s constant) 


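$$D_{\mu\nu,\lambda\sigma}(k) = \frac{1}{2}\, \frac{\eta_{\mu\lambda} \eta_{\nu\sigma} + \eta_{\mu\sigma} \eta_{\nu\lambda} - \eta_{\mu\nu} \eta_{\lambda\sigma}}{k^2 + i\varepsilon} \qquad (12)$$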
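
The statement that the tensor structure $\tfrac{1}{2}(\eta_{\mu\lambda}\eta_{\nu\sigma} + \eta_{\mu\sigma}\eta_{\nu\lambda} - \eta_{\mu\nu}\eta_{\lambda\sigma})$ is its own inverse on symmetric tensors can be confirmed by brute force. The sketch below is an illustration under stated assumptions (a diagonal metric with signature (+,-,...,-), function names invented for the example); it contracts the structure with itself in exact rational arithmetic and compares with the identity on symmetric tensors.

```python
from fractions import Fraction
from itertools import product

def eta_diag(dim):
    # diagonal Minkowski metric with one time direction (assumed signature)
    return [1] + [-1] * (dim - 1)

def eta(a, b, diag):
    return diag[a] if a == b else 0

def K(m, n, l, s, diag):
    # K_{mn;ls} = 1/2 (eta_ml eta_ns + eta_ms eta_nl - eta_mn eta_ls)
    return Fraction(1, 2) * (eta(m, l, diag) * eta(n, s, diag)
                             + eta(m, s, diag) * eta(n, l, diag)
                             - eta(m, n, diag) * eta(l, s, diag))

def identity(m, n, r, w, diag):
    # identity on symmetric two-index tensors
    return Fraction(1, 2) * (eta(m, r, diag) * eta(n, w, diag)
                             + eta(m, w, diag) * eta(n, r, diag))

def K_squared_is_identity(dim):
    diag = eta_diag(dim)
    idx = range(dim)
    for m, n, r, w in product(idx, repeat=4):
        # contract the middle index pair with eta^{ll} eta^{ss};
        # a diagonal metric with entries +1/-1 is its own inverse
        total = sum(K(m, n, l, s, diag) * diag[l] * diag[s] * K(l, s, r, w, diag)
                    for l, s in product(idx, repeat=2))
        if total != identity(m, n, r, w, diag):
            return False
    return True

print(K_squared_is_identity(4), K_squared_is_identity(5))
```

The check passes in four dimensions and fails in five: writing K as the identity minus half the trace projector, one finds $K^2 = I + (D/4 - 1)\,\eta\eta$, so the self-inverse property is special to D = 4.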


Newton from Einstein 

Varying (10) with respect to $h_{\mu\nu}$ we obtain the Euler-Lagrange equation of motion⁴
$\frac{1}{32\pi G} (-2 \partial^2 h_{\mu\nu} + \eta_{\mu\nu} \partial^2 h) - T_{\mu\nu} = 0$. Taking the trace, we find $\partial^2 h = 16\pi G T$ (with
$T \equiv \eta^{\mu\nu} T_{\mu\nu}$) and so we obtain⁵

$$\partial^2 h_{\mu\nu} = -16\pi G\, (T_{\mu\nu} - \tfrac{1}{2} \eta_{\mu\nu} T) \qquad (13)$$

In the static limit, $T_{00}$ is the dominant component⁶ of the stress-energy tensor and (13)
reduces to $\nabla^2 \phi = 4\pi G T_{00}$ upon recalling from chapter I.5 that the Newtonian gravitational
potential $\phi \equiv \tfrac{1}{2} h_{00}$. We have just derived Poisson's equation for $\phi$.

Incidentally, this suggests another way of avoiding the tedious task of expanding the
Einstein-Hilbert action (and hence $R_{\mu\nu}$) to $O(h^2)$ if you are willing to accept the Einstein
field equation (4) as given. You need expand $R_{\mu\nu}$ only to $O(h)$ to obtain (13) from (4), and
from (13) you can reconstruct the action to $O(h^2)$. Indeed, from (2) and (3) you easily get

$$R_{\mu\nu} = \tfrac{1}{2} (-\partial^2 h_{\mu\nu} + \partial_\mu \partial_\lambda h^\lambda_\nu + \partial_\nu \partial_\lambda h^\lambda_\mu - \partial_\mu \partial_\nu h^\lambda_\lambda) + O(h^2) \to -\tfrac{1}{2} \partial^2 h_{\mu\nu} + O(h^2)$$

with the further simplification in harmonic gauge. But this is not quite fair since
considerable technology⁷ (Palatini identity and all the rest) is needed to derive (4) from (1).


⁴ Note that the flat spacetime energy momentum conservation $\partial^\mu T_{\mu\nu} = 0$ together with the equation of motion
implies $\partial^2 (\partial^\mu h_{\mu\nu} - \tfrac{1}{2} \partial_\nu h) = 0$.

⁵ Thus, the Einstein equation in vacuum $R_{\mu\nu} = 0$ reduces to $\partial^2 h_{\mu\nu} = 0$; hence the name "harmonic."

⁶ Note that, in contrast to $T_{00}$, $h_{00}$ does not dominate the other components of $h_{\mu\nu}$.

⁷ See S. Weinberg, Gravitation and Cosmology, pp. 290 and 364.

Einstein's theory and the deflection of light


Consider two particles with stress-energy tensors $T^{\mu\nu}_{(1)}$ and $T^{\mu\nu}_{(2)}$ respectively interacting via
the exchange of a graviton. The scattering amplitude is then (up to some overall constant
not essential for our purposes here) given by

$$G\, T^{\mu\nu}_{(1)} D_{\mu\nu,\lambda\sigma}(k)\, T^{\lambda\sigma}_{(2)} = \frac{G}{2k^2} \left( 2 T^{\mu\nu}_{(1)} T_{(2)\mu\nu} - T_{(1)} T_{(2)} \right)$$

For nonrelativistic matter $T^{00}$ is much larger than the other components $T^{0j}$ and $T^{ij}$ (as
I have just remarked), so the scattering amplitude between two lumps of nonrelativistic
matter (say, the earth and you) is proportional to

$$\frac{G}{2k^2} \left( 2 T^{00}_{(1)} T^{00}_{(2)} - T^{00}_{(1)} T^{00}_{(2)} \right) = \frac{G}{2k^2}\, T^{00}_{(1)} T^{00}_{(2)}$$

As explained way back in chapters I.4 and I.5, the interaction potential is given by the
Fourier transform of the scattering amplitude, namely

$$G \int\!\!\int d^3x\, d^3x'\, \frac{T^{00}_{(1)}(x)\, T^{00}_{(2)}(x')}{|\vec{x} - \vec{x}'|}$$

and thus for two well-separated objects we recover the Newtonian potential $G M_{(1)} M_{(2)}/r$.

We are now able to address the issue raised at the end of chapter I.5. Suppose a particle
theorist, Dr. Gravity, wants to propose a theory of gravity to rival Einstein's theory. Dr. G
claims that gravity is due to the exchange of a spin 2 particle with a teeny mass $m_G$ coupled
to the stress-energy tensor $T^{\mu\nu}$. In chapter I.5 we worked out the propagator of a massive
spin 2 particle, namely

$$D^{\rm spin\,2}_{\mu\nu,\lambda\sigma}(k) = \frac{1}{2}\, \frac{G_{\mu\lambda} G_{\nu\sigma} + G_{\mu\sigma} G_{\nu\lambda} - \tfrac{2}{3} G_{\mu\nu} G_{\lambda\sigma}}{k^2 - m_G^2 + i\varepsilon}$$

with $G_{\mu\nu} \equiv \eta_{\mu\nu} - k_\mu k_\nu/m_G^2$ (after a trivial notational adjustment). Since the particle is
coupled to a conserved source $k_\mu T^{\mu\nu} = 0$ we can replace $G_{\mu\nu}$ by $\eta_{\mu\nu}$. Thus, in the limit
$m_G \to 0$ we have the propagator

$$D^{\rm spin\,2}_{\mu\nu,\lambda\sigma}(k) = \frac{1}{2}\, \frac{\eta_{\mu\lambda} \eta_{\nu\sigma} + \eta_{\mu\sigma} \eta_{\nu\lambda} - \tfrac{2}{3} \eta_{\mu\nu} \eta_{\lambda\sigma}}{k^2 + i\varepsilon} \qquad (14)$$

Compare this with (12). Dr. G's propagator differs from Einstein's: $\tfrac{2}{3}$ versus 1. Remarkably,
gravity is not generated by an almost massless spin 2 particle. The "discontinuity"
between (12) and (14) was discovered in 1970 independently by Iwasaki, by van Dam and
Veltman, and by Zakharov.

In Dr. G's theory (with his own gravitational coupling $G_G$), the interaction between two
particles is given by

$$G_G\, T^{\mu\nu}_{(1)} D^{\rm spin\,2}_{\mu\nu,\lambda\sigma}(k)\, T^{\lambda\sigma}_{(2)} = \frac{G_G}{2k^2} \left( 2 T^{\mu\nu}_{(1)} T_{(2)\mu\nu} - \tfrac{2}{3}\, T_{(1)} T_{(2)} \right)$$

For two lumps of nonrelativistic matter this becomes

$$\frac{G_G}{2k^2} \left( 2 T^{00}_{(1)} T^{00}_{(2)} - \tfrac{2}{3}\, T^{00}_{(1)} T^{00}_{(2)} \right) = \frac{4}{3}\, \frac{G_G}{2k^2}\, T^{00}_{(1)} T^{00}_{(2)}$$

Dr. G simply takes his $G_G = \tfrac{3}{4} G$ and his theory passes all experimental tests.

But wait! There is also the famous 1919 observation of the deflection of starlight by
the sun, and the photon is definitely not a lump of nonrelativistic matter. Indeed, recall
from chapter I.11 (or from your course on electromagnetism) that $T \equiv T^\mu_\mu$ vanishes for
the photon. Thus, taking $T^{\mu\nu}_{(1)}$ and $T^{\mu\nu}_{(2)}$ to be the stress-energy tensor of the sun and of the
photon respectively, Einstein would have for the scattering amplitude $(G/2k^2)\, 2 T^{\mu\nu}_{(1)} T_{(2)\mu\nu}$
while Dr. G would have $(G_G/2k^2)\, 2 T^{\mu\nu}_{(1)} T_{(2)\mu\nu} = \tfrac{3}{4} (G/2k^2)\, 2 T^{\mu\nu}_{(1)} T_{(2)\mu\nu}$. Dr. G would have
predicted a deflection angle of $3GM/R$ instead of $4GM/R$ (with M and R the mass
and radius of the sun). In 1919, at Sobral in Brazil and on the island of Príncipe, Einstein
triumphed over Dr. G.
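
The bookkeeping behind "3 versus 4" fits in a few lines of exact arithmetic. The following sketch is illustrative only: it represents stress-energy tensors as 4 by 4 arrays with upper indices, assumes the signature (+,-,-,-), takes the light beam to move in the x-direction with $T^{00} = T^{0x} = T^{x0} = T^{xx}$, and uses made-up masses and energies. The helper names are inventions for the example.

```python
from fractions import Fraction

ETA = [1, -1, -1, -1]   # assumed signature (+,-,-,-)
IDX = range(4)

def contract(T1, T2):
    # T1^{mu nu} T2_{mu nu}, both supplied with upper indices
    return sum(ETA[m] * ETA[n] * T1[m][n] * T2[m][n] for m in IDX for n in IDX)

def trace(T):
    return sum(ETA[m] * T[m][m] for m in IDX)

def exchange_factor(T1, T2, trace_coeff):
    # the bracket (2 T1.T2 - c T1 T2): c = 1 for the massless graviton,
    # c = 2/3 for the m_G -> 0 limit of a massive spin 2 particle
    return 2 * contract(T1, T2) - trace_coeff * trace(T1) * trace(T2)

def dust(mass):
    # nonrelativistic lump: only T^{00} is nonzero
    T = [[0] * 4 for _ in IDX]
    T[0][0] = mass
    return T

def light_beam_x(energy):
    # beam along x: T^{00} = T^{0x} = T^{x0} = T^{xx}
    T = [[0] * 4 for _ in IDX]
    for m in (0, 1):
        for n in (0, 1):
            T[m][n] = energy
    return T

EINSTEIN, DR_G = Fraction(1), Fraction(2, 3)
sun, planet, photon = dust(7), dust(3), light_beam_x(5)   # sample numbers

# Choose G_G/G so that the Newtonian amplitudes of the two theories agree.
coupling_ratio = (exchange_factor(sun, planet, EINSTEIN)
                  / exchange_factor(sun, planet, DR_G))

# With that choice fixed, compare the sun-photon amplitudes.
deflection_ratio = (coupling_ratio * exchange_factor(sun, photon, DR_G)
                    / exchange_factor(sun, photon, EINSTEIN))
print(coupling_ratio, deflection_ratio, 4 * deflection_ratio)
```

Because the photon's stress-energy tensor is traceless, the trace term drops out of the sun-photon amplitude in both theories, and the only surviving difference is the rescaled coupling.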

As explained in chapter I.5, while a massive spin 2 particle has 5 degrees of freedom the
massless graviton has only 2. (I give an analysis of the helicity ±2 structure of one graviton
exchange in appendix 2.) The 5 degrees of freedom may be thought of as consisting of the
helicity ±2 degrees of freedom we want plus two helicity ±1 degrees of freedom and one
helicity 0 degree of freedom. The coupling of the helicity ±1 degrees of freedom vanishes
because $k_\mu T^{\mu\nu} = 0$.

Thus, effectively, we are left with an extra scalar coupling to the trace $T \equiv \eta_{\mu\nu} T^{\mu\nu}$ of the
stress-energy tensor; as we can see plainly the discrepancy indeed resides in the last term
of (12) and (14).

You should be disturbed that a measurement of the deflection of starlight can show
that a physical quantity, the graviton mass $m_G$, is mathematically zero rather than less
than some extremely small value. This apparent paradox was resolved by A. Vainshtein in
1972.⁸ He found that Dr. G's theory contains a distance scale

$$r_V = \left( \frac{GM}{m_G^4} \right)^{1/5}$$

in the gravitational field around a body of mass M. The helicity 0 degree of freedom
becomes effective only on the distance scales $r \gg r_V$. Inside the Vainshtein radius $r_V$, the
gravitational field is the same as in Einstein's theory and experiments cannot distinguish
between Einstein's and Dr. G's theories. With the current astrophysical bound
$m_G \ll (10^{24}\ {\rm cm})^{-1}$ and M the mass of the sun, $r_V$ comes out to be much larger than the size
of the solar system. In other words, the apparent paradox arose because of an interchange
of limits: We can take either the characteristic distance of the measurement $r_{\rm obs}$ (the radius
of the sun in the deflection of starlight) or the Vainshtein radius $r_V$ to infinity first.
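
To get a feel for the numbers, here is a rough order-of-magnitude estimate. It is a sketch under explicit assumptions: the Vainshtein radius is taken in the form $r_V = (GM/m_G^4)^{1/5}$ in units with $\hbar = c = 1$, all lengths are in centimeters, the graviton mass is set equal to the bound $(10^{24}\ {\rm cm})^{-1}$, and the solar inputs are rounded textbook values (the constant names are inventions for the example).

```python
# Rough inputs in centimeters, with hbar = c = 1 so that masses are inverse lengths.
GM_SUN_CM = 1.48e5          # G M_sun / c^2, about 1.48 km
M_GRAVITON_BOUND = 1e-24    # graviton mass set at the quoted bound, in cm^-1
SOLAR_SYSTEM_CM = 4.5e14    # about 30 AU, roughly the radius of Neptune's orbit
SUN_RADIUS_CM = 7.0e10      # radius of the sun

def vainshtein_radius(gm_cm, m_g):
    # assumed form r_V = (G M / m_G^4)^(1/5)
    return (gm_cm / m_g ** 4) ** 0.2

r_v = vainshtein_radius(GM_SUN_CM, M_GRAVITON_BOUND)
print(f"r_V ~ {r_v:.1e} cm")
print(f"r_V / (solar system) ~ {r_v / SOLAR_SYSTEM_CM:.1e}")
print(f"r_V / (solar radius) ~ {r_v / SUN_RADIUS_CM:.1e}")
```

Since $r_V$ scales as $m_G^{-4/5}$, any smaller graviton mass only pushes the Vainshtein radius further out, so the starlight grazing the sun is deep inside the region where the two theories agree.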

So all is well: Dr. G's theory is consistent with current measurements provided that
he takes $m_G$ small enough. What he is not allowed to do is use the one graviton exchange
approximation. Instead, he should solve the massive analog of Einstein's field equation (4)
around a massive body such as the sun, as Vainshtein did. This is equivalent to expanding
to all orders in the graviton field h and resumming: In Feynman diagram language we
have an infinite number of diagrams corresponding to the sun emitting 1, 2, 3, ..., ∞
gravitons respectively. The paradox is formally resolved by noting that the higher orders
are increasingly singular as $m_G \to 0$.

⁸ A. I. Vainshtein, Phys. Lett. 39B:393, 1972; see also C. Deffayet, G. Dvali, G. Gabadadze, and A. I. Vainshtein,
Phys. Rev. D65:044026, 2002.


The gravity of light 

At this point, you are ready to do perturbative quantum gravity: You have the graviton
propagator (12), and you can read off the interaction between gravitons from the detailed
version of (7) and the interaction between the graviton and any other field from the term
$-\tfrac{1}{2} h_{\mu\nu} T^{\mu\nu}$. The only trouble is that you might "drown in a sea of indices" if you don't watch
out, as I have already warned you.

I know of one calculation (in fact one of my favorites in theoretical physics) in which 
we can beat the indices down easily. An interesting question: Einstein said that light 
is deflected by a massive object, but is light deflected gravitationally by light? Tolman, 
Ehrenfest, and Podolsky discovered that in the weak field limit two light beams moving 
in the same direction do not interact gravitationally, but two light beams moving in the 
opposite directions do. Surprising, eh? 

The scattering of two photons $k_1 + k_2 \to p_1 + p_2$ via the exchange of a graviton is given
by the Feynman diagram in figure VIII.1.2, with the momentum transfer $q \equiv p_1 - k_1$, plus
another diagram with $p_1$ and $p_2$ interchanged. The Feynman rule for coupling a graviton
to two photons can be read off from

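$$-\tfrac{1}{2}\, h^{\mu\nu} T_{\mu\nu} = -\tfrac{1}{2}\, h^{\mu\nu} \left( -F_{\mu\lambda} F_\nu{}^\lambda + \tfrac{1}{4}\, \eta_{\mu\nu} F_{\rho\lambda} F^{\rho\lambda} \right)$$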

but all we need is that the interaction involve two powers of spacetime derivatives $\partial$ acting
on the electromagnetic potential A so that the graviton-photon-photon vertex involves
2 powers of momenta, one from each photon. Hence the scattering amplitude (with all
Lorentz indices suppressed) has the schematic form $\sim (k_1 p_1) D (k_2 p_2)$. The $\eta$'s in the
graviton propagator D tie the indices on $(k_1 p_1)$ and $(k_2 p_2)$ together. (We have suppressed
the polarization vectors of the photons, imagining that they are to be averaged over in
the amplitude squared.) Referring to (12), we see that the amplitude is the sum of three
terms such as $\sim (k_1 \cdot p_1)(k_2 \cdot p_2)/q^2$, $\sim (k_1 \cdot k_2)(p_1 \cdot p_2)/q^2$, and $\sim (k_1 \cdot p_2)(k_2 \cdot p_1)/q^2$.
Since according to Fourier the long distance part of the interaction potential is given by
the small q behavior of the scattering amplitude, we need only evaluate these terms in the
limit $q \to 0$. We can throw almost everything away! For example,

$$k_1 \cdot p_1 \to k_1 \cdot k_1 = 0, \qquad k_1 \cdot p_2 = k_1 \cdot (k_1 + k_2 - p_1) \to k_1 \cdot k_2$$

Just imagining contracting all those indices in our heads is good enough: We obtain the
amplitude $\sim (k_1 \cdot k_2)(p_1 \cdot p_2)/q^2$.

If $k_1$ and $k_2$ point in the same direction, $k_1 \cdot k_2 \propto k_1 \cdot k_1 = 0$. Two photons moving in the
same direction do not interact gravitationally.
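
The kinematics can be made concrete with a tiny numerical example. This is only an illustrative sketch with made-up frequencies; the helper names and the signature (+,-,-,-) are assumptions. In the forward limit $q \to 0$ we have $p_1 \to k_1$ and $p_2 \to k_2$, so the surviving numerator $(k_1 \cdot k_2)(p_1 \cdot p_2)$ reduces to $(k_1 \cdot k_2)^2$.

```python
def photon(omega, direction):
    # massless 4-momentum k^mu = omega * (1, n), with n a unit 3-vector
    return (omega,) + tuple(omega * n for n in direction)

def dot(a, b):
    # Minkowski product with signature (+,-,-,-)
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

k1 = photon(2, (1, 0, 0))
k2_same = photon(3, (1, 0, 0))         # moving in the same direction as k1
k2_opposite = photon(3, (-1, 0, 0))    # moving in the opposite direction

# Forward limit q -> 0: p1 -> k1 and p2 -> k2, so the numerator is (k1.k2)^2.
numerator_same = dot(k1, k2_same) ** 2
numerator_opposite = dot(k1, k2_opposite) ** 2
print(numerator_same, numerator_opposite)
```

For antiparallel beams $k_1 \cdot k_2 = 2\omega_1\omega_2$, the largest value it can take, which is why the effect shows up for beams moving in opposite directions.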

Of course, this result is not of any practical importance since electromagnetic effects are 
far more important, but this is not an engineering text. In appendix 1 I give an alternative 
derivation of this amusing result. 


Kaluza-Klein compactification 

You have probably read about how excited Einstein was when he heard of the proposal of
Kaluza and of Klein to extend the dimension of spacetime to 5 and thus unify
electromagnetism and gravity. The 5th dimension is supposed to be compactified into a tiny circle of
radius a far smaller than what experimentalists can see; in other words, $x^5$ is an angular
variable with $x^5 = x^5 + 2\pi a$. You have surely heard that string theory, at least in some
version, is based on the Kaluza-Klein idea. Strings live in 10-dimensional spacetime, with 6
of the dimensions compactified.

I can now show you how the Kaluza-Klein mechanism works. Start with the action

$$S = \frac{1}{16\pi G_5} \int d^5x\, \sqrt{g_5}\, R_5 \qquad (15)$$

in 5-dimensional spacetime. The subscript 5 serves to indicate the 5-dimensional
quantities. We denote the 5-dimensional metric by $g_{AB}$ with the indices A and B running over
0, 1, 2, 3, 5.

Assume that $g_{AB}$ does not depend on $x^5$. Plug into S, integrate over $x^5$, and compute
the effective 4-dimensional action. Since $R_5$ and the 4-dimensional scalar curvature R
both involve two powers of $\partial$ and $g_{AB}$ contains $g_{\mu\nu}$, we must have (exercise VIII.1.5)
$R_5 = R + \cdots$. Thus, (15) contains the Einstein-Hilbert action with Newton's gravitational
constant $G \sim G_5/a$.

What else do we get? We don't even have to work through the arithmetic. We can
argue by symmetry. Under the 5-dimensional coordinate transformation
$x^A \to x'^A = x^A + \epsilon^A(x)$, we have [see (8)] $h'_{AB} = h_{AB} - \partial_A \epsilon_B - \partial_B \epsilon_A$. Let us choose $\epsilon_\mu = 0$ and $\epsilon_5(x)$ to
be independent of $x^5$: We go around and rotate each of the tiny circles attached to every point
in our spacetime a tiny bit. Well, we have $h'_{\mu\nu} = h_{\mu\nu}$ and $h'_{55} = h_{55}$, but $h'_{\mu 5} = h_{\mu 5} - \partial_\mu \epsilon_5$.
But if we give the Lorentz 4-vector $h_{\mu 5}$ and 4-scalar $\epsilon_5$ new names, call them $A_\mu$ and $\Lambda$,
this just says $A'_\mu = A_\mu - \partial_\mu \Lambda$, the usual electromagnetic gauge transformation!

Since we know that the 5-dimensional action (15) is invariant under
$x^A \to x'^A = x^A + \epsilon^A(x)$, the resulting 4-dimensional action must be invariant under
$A_\mu \to A'_\mu = A_\mu - \partial_\mu \Lambda$ and hence must contain the Maxwell action. Note once again the power of
symmetry considerations. No need to do tedious calculations.

Electromagnetism comes out of gravity! 
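
The index bookkeeping in this symmetry argument can be checked mechanically. The sketch below is purely illustrative: it picks a hypothetical gauge function $\Lambda(x)$ (a low-order polynomial, so that central differences are essentially exact), sets $\epsilon_\mu = 0$ and $\epsilon_5 = \Lambda$, and evaluates $\delta h_{AB} = -\partial_A \epsilon_B - \partial_B \epsilon_A$ by finite differences, with the five coordinates ordered as (0, 1, 2, 3, 5). All names and numbers are assumptions made for the example.

```python
def partial(f, x, i, step=1e-3):
    # central finite difference of f along coordinate i
    xp, xm = list(x), list(x)
    xp[i] += step
    xm[i] -= step
    return (f(xp) - f(xm)) / (2 * step)

def gauge_function(x):
    # Lambda(x^mu): a hypothetical sample, independent of x^5 = x[4]
    return x[0] * x[1] + 2.0 * x[2] - x[3] ** 2

def epsilon(B):
    # eps_mu = 0 for the four spacetime components, eps_5 = Lambda
    if B == 4:
        return gauge_function
    return lambda x: 0.0

def delta_h(x):
    # delta h_AB = -d_A eps_B - d_B eps_A
    return [[-partial(epsilon(B), x, A) - partial(epsilon(A), x, B)
             for B in range(5)] for A in range(5)]

point = [0.3, -1.2, 0.7, 2.0, 0.5]
dh = delta_h(point)
grad_lambda = [point[1], point[0], 2.0, -2.0 * point[3]]   # analytic gradient

for mu in range(4):
    print(mu, dh[mu][4], -grad_lambda[mu])
```

Only the mixed components $h_{\mu 5}$ move, and they shift by minus the gradient of $\Lambda$, exactly as a gauge potential should.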


Differential geometry of Riemannian manifolds 

I hinted earlier at a deep connection between general coordinate transformation and gauge
transformation. Let us flesh this out by looking at differential geometry and gravity. For
this sketch we will consider locally Euclidean (rather than Minkowskian) spaces.

The differential geometry of Riemannian manifolds can be elegantly summarized in
the language of differential forms. Consider a Riemannian manifold (such as a sphere)
with the metric $g_{\mu\nu}(x)$. Locally, the manifold is Euclidean by definition, which means

$$g_{\mu\nu}(x) = e^a_\mu(x)\, \delta_{ab}\, e^b_\nu(x) \qquad (16)$$

where the matrix e(x) may be thought of as a similarity transformation that diagonalizes
$g_{\mu\nu}$ and scales it to the unit matrix. Thus, for a D-dimensional manifold there exist D
"world vectors" $e^a_\mu(x)$ obviously dependent on x and labeled by the index a = 1, 2, ..., D.
The functions $e^a_\mu(x)$ are known as "vielbeins" (meaning "many legs" in German; vierbeins
= four legs for D = 4, dreibeins = three legs for D = 3, and so on). In some sense, the
vielbeins can be thought of as the "square root" of the metric.

Let us clarify by a simple example. The familiar two-sphere (of unit radius) has the line
element⁹ $ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2$. From the metric ($g_{\theta\theta} = 1$, $g_{\varphi\varphi} = \sin^2\theta$) we can read off
$e^1_\theta = 1$ and $e^2_\varphi = \sin\theta$ (all other components are zero). We are invited to define D 1-forms
$e^a = e^a_\mu dx^\mu$. (In our example, $e^1 = d\theta$, $e^2 = \sin\theta\, d\varphi$.)

On a curved manifold, when we parallel transport a vector, the vector changes when 
expressed in terms of the locally Euclidean coordinate frame. (This is just the familiar 
statement that on a curved manifold such as the surface of the earth the notion of a vector 
pointing straight north is a local concept: When we move infinitesimally away keeping our 
“north vector” pointing in the same direction, it will end up being infinitesimally rotated 
away from the “north vector” defined at the point we have just moved to.) This infinitesimal 
rotation of the vielbeins is described by 

$$de^a = -\omega^{ab} e^b \qquad (17)$$

Note that since $\omega$ generates an infinitesimal rotation it is an antisymmetric matrix:
$\omega^{ab} = -\omega^{ba}$. Since $de^a$ is a 2-form, $\omega$ is a 1-form, known as the connection: It "connects" the
locally Euclidean frames at nearby points. (Since the indices a, b, etc. are associated with
the Euclidean metric $\delta_{ab}$ we do not have to distinguish between upper and lower indices.
When we do write upper or lower indices a, b, etc. it is for typographical convenience.) In
the simple example of the sphere, $de^1 = 0$ and $de^2 = \cos\theta\, d\theta\, d\varphi$ and so the connection
has only one nonvanishing component $\omega^{12} = -\omega^{21} = -\cos\theta\, d\varphi$.

⁹ Note that this represents the square of an infinitesimal distance element and not an area element, and so a
quantity such as $d\theta^2$ is literally the square of $d\theta$ and not the wedge product $d\theta\, d\theta$ (of chapter IV.4), which would
have been identically zero.

At any point, we are free to rotate the vielbeins: If you use the vielbeins $e^a$ I am free to use
some other vielbeins $e'^a$ instead, as long as mine are related to yours by a rotation
$e^a_\mu(x) = O^a{}_b(x)\, e'^b_\mu(x)$. [You can check that $g_{\mu\nu}(x) = e^a_\mu(x) \delta_{ab} e^b_\nu(x) = e'^a_\mu(x) \delta_{ab} e'^b_\nu(x)$ if $O^T O = 1$.]
The connection $\omega'$ is defined by $de'^a = -\omega'^{ab} e'^b$. You can readily work out that (suppressing
indices)

$$\omega = O \omega' O^T - (dO)\, O^T \qquad (18)$$


The local curvature of the manifold is a measure of how the connection varies from
point to point. We would like the curvature to be invariant under the local rotation O (or at
least to transform as a tensor so that by contracting it with vectors we can form a scalar).
The desired object is the 2-form $R^{ab} = d\omega^{ab} + \omega^{ac} \omega^{cb}$. You can check that $R = O R' O^T$.
(For the sphere, $R^{12} = d\omega^{12} + \omega^{1c} \omega^{c2} = \sin\theta\, d\theta\, d\varphi$.) Written out in components,
$R^{ab} = R^{ab}{}_{\mu\nu}\, dx^\mu dx^\nu$. I leave it to you to verify that $R^{ab}{}_{\mu\nu}\, e^\lambda_a e^\sigma_b$ is the usual Riemann curvature tensor
$R^{\lambda\sigma}{}_{\mu\nu}$, where $e^\lambda_a$ is the inverse of the matrix $e^a_\lambda$. In particular, $R^{ab}{}_{\mu\nu}\, e^\mu_a e^\nu_b$ is the scalar curvature,
which in our convention works out to be +1 for the sphere.
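
The sphere example is simple enough to verify numerically. The sketch below is an illustration only, with function names invented for the purpose: it rebuilds the metric from the vielbein $e^1_\theta = 1$, $e^2_\varphi = \sin\theta$, differentiates the connection component $\omega^{12}_\varphi = -\cos\theta$ by finite differences to get the coefficient of $d\theta\, d\varphi$ in $R^{12}$, and divides by the coefficient of the area 2-form $e^1 e^2 = \sin\theta\, d\theta\, d\varphi$.

```python
import math

def vielbein(theta):
    # e[a][mu] with a in (1, 2) and mu in (theta, phi)
    return [[1.0, 0.0], [0.0, math.sin(theta)]]

def metric_from_vielbein(e):
    # g_{mu nu} = e^a_mu delta_ab e^b_nu
    return [[sum(e[a][m] * e[a][n] for a in range(2)) for n in range(2)]
            for m in range(2)]

def omega12_phi(theta):
    # the phi-component of omega^{12} = -cos(theta) dphi
    return -math.cos(theta)

def curvature12_theta_phi(theta, step=1e-5):
    # coefficient of dtheta dphi in d(omega^{12}); the omega.omega term
    # vanishes for the sphere because omega has a single component
    return (omega12_phi(theta + step) - omega12_phi(theta - step)) / (2 * step)

def curvature_per_area(theta):
    # R^{12} divided by the area 2-form e^1 e^2 = sin(theta) dtheta dphi
    e = vielbein(theta)
    return curvature12_theta_phi(theta) / (e[0][0] * e[1][1])

samples = [0.3, 1.0, 2.5]
ratios = [curvature_per_area(t) for t in samples]
print(ratios)
```

The ratio comes out the same at every latitude, which is the statement that the sphere has constant curvature.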

Thus, Riemannian geometry can be elegantly summarized by the two statements (again 
suppressing indices) 

$$de + \omega e = 0 \qquad (19)$$

and

$$R = d\omega + \omega^2 \qquad (20)$$

Look familiar? You should be struck by the similarity between (20) and the expression
for the field strength in nonabelian gauge theories $F = dA + A^2$. Note that $\omega$ transforms [see
(18)] exactly the same way as the gauge potential A. But one nagging difference, namely
the lack of an analog of e in gauge theory, has long bothered some theoretical physicists
(but is shrugged off by most as inconsequential). Also, note that Einstein theory is linear
in R while Yang-Mills theory is quadratic in F.


Gravity and Yang-Mills 

We can make the connection between gravity and Yang-Mills theory more explicit by
looking at the derivative of a vector field. Yang-Mills theory was born of the requirement
that a field $\psi$ and its derivative $\partial_\mu \psi$ transform in the same way under a
spacetime-dependent internal symmetry transformation (IV.5.1). In Einstein gravity a vector field
$W^\mu(x)$ transforms as $W'^\mu(x') = S^\mu{}_\nu(x)\, W^\nu(x)$ with $S^\mu{}_\nu(x) = \partial x'^\mu/\partial x^\nu$. Since the matrix S
depends on the spacetime coordinate x, we see that $\partial_\lambda W^\mu$ could not possibly transform
like a tensor with one upper and one lower index, as we would like naively just by looking
at indices. We would have to introduce a covariant derivative. Not surprisingly, this closely
parallels the discussion in chapter IV.5. Historically, Yang and Mills were inspired by
Einstein gravity.

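Using the chain rule and the product rule, we have

$$\frac{\partial W'^\mu(x')}{\partial x'^\lambda} = \frac{\partial x^\rho}{\partial x'^\lambda} \frac{\partial}{\partial x^\rho} \left[ S^\mu{}_\nu(x)\, W^\nu(x) \right] = (S^{-1})^\rho{}_\lambda\, S^\mu{}_\nu\, \partial_\rho W^\nu + \left[ (S^{-1})^\rho{}_\lambda\, \partial_\rho S^\mu{}_\nu \right] W^\nu \qquad (21)$$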


Were the second term in (21), which comes from differentiating S, not there, the naive
guess, that $\partial_\lambda W^\mu$ transforms like a tensor, would be valid. The fact that the transformation
S varies from place to place has negated the naive guess.

What is happening is quite clear: as the vector W varies from a given point to a
neighboring point, the coordinate axes that define the components of W also change. This suggests
that we could define a more suitable derivative, called the covariant derivative and written
as $D_\lambda W^\mu$, to take this effect into account, so that $D_\lambda W^\mu$ would indeed transform like a
tensor. Exactly as in Yang-Mills theory (IV.5.1), we have to add an extra term to knock out
the second term in (21).

Just the way the indices hang together immediately suggests the correct construction.
The factor $(S^{-1})^\rho{}_\lambda\, \partial_\rho S^\mu{}_\nu$ in the unwanted second term in (21) has one upper index and two
lower indices, so we need an object with this set of indices. Lo, the Riemann-Christoffel
symbol in (3) (and introduced in chapter I.11) fits the bill perfectly. I will let you have
the fun of verifying that the covariant derivative defined by

$$D_\lambda W^\mu = \partial_\lambda W^\mu + \Gamma^\mu_{\lambda\nu} W^\nu \qquad (22)$$

indeed transforms like a tensor (note that $\Gamma$ was normalized correctly for this purpose).

I end with a technical remark about the coupling of gravity to spin $\tfrac{1}{2}$ fields. First,
we of course have to Wick rotate so that the vierbein $e^a_\mu$ erects a locally Minkowskian
rather than a Euclidean coordinate frame. The indices a, b, etc. are now contracted with
the Minkowskian metric $\eta_{ab}$. The slight subtlety is that the Dirac gamma matrices $\gamma^a$
are associated with the Lorentz rotation of the vierbein $e^a_\mu(x) = O^a{}_b(x)\, e'^b_\mu(x)$ and thus
carry the Lorentz index a rather than the "world" index $\mu$. Similarly, the Dirac spinor
$\psi(x)$ is defined relative to the local Lorentz frame specified by the vierbein, and thus
its covariant derivative has to be defined in terms of the connection $\omega$ rather than the
symbol $\Gamma$. Hence the flat space Dirac action $\int d^4x\, \bar\psi (i \gamma^\mu \partial_\mu - m) \psi$ must be generalized
to $\int d^4x\, \sqrt{-g}\, \bar\psi (i \gamma^a \eta_{ab} e^{b\mu} D_\mu - m) \psi$, where the covariant derivative
$D_\mu \psi = \partial_\mu \psi - \tfrac{i}{4} \omega_{\mu ab} \sigma^{ab} \psi$ expresses the rotation of the local Lorentz frame as we move from a point x to
a neighboring point. In contrast to the action for integer spin fields in curved spacetime
(see chapter I.11), the Dirac action in curved spacetime involves the vierbein explicitly.


Appendix 1: Light on light again 


The stress-energy tensor $T^{\mu\nu}$ of a light beam moving in the x-direction has four nonzero components: the
energy density $T^{00}$ of course, then $T^{0x} = T^{00}$ since photons carry the same energy and momentum, next
$T^{x0} = T^{0x}$ by symmetry, and finally $T^{xx} = T^{00}$ since the stress-energy tensor of the electromagnetic field is
traceless (chapter I.11). Without having to solve Einstein's equations in the weak field limit (13) explicitly we
know immediately that $h^{00} = h^{0x} = h^{x0} = h^{xx} \equiv h$. The metric around the light beam is given by $g_{00} = 1 + h$,
$g_{0x} = g_{x0} = -h$, and $g_{xx} = -1 + h$ (and of course $g_{yy} = g_{zz} = -1$ plus a bunch of vanishing components).
Consider a photon moving parallel to the light beam. Its worldline is determined by (recall chapter I.11)

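$$\frac{d^2 x^\rho}{d\zeta^2} = -\Gamma^\rho_{\mu\nu}\, \frac{dx^\mu}{d\zeta} \frac{dx^\nu}{d\zeta}$$

Let's calculate $d^2y/d\zeta^2$ and $d^2z/d\zeta^2$ with $(dy/d\zeta), (dz/d\zeta) \ll (dt/d\zeta), (dx/d\zeta)$. Using (3) we find (with $\mu$, $\nu$
restricted to 0, x)

$$\frac{d^2 y}{d\zeta^2} = \tfrac{1}{2} (\partial_\mu g_{y\nu} + \partial_\nu g_{y\mu} - \partial_y g_{\mu\nu})\, \frac{dx^\mu}{d\zeta} \frac{dx^\nu}{d\zeta} = -\tfrac{1}{2} (\partial_y h) \left[ \left( \frac{dt}{d\zeta} \right)^2 + \left( \frac{dx}{d\zeta} \right)^2 - 2\, \frac{dt}{d\zeta} \frac{dx}{d\zeta} \right] = -\tfrac{1}{2} (\partial_y h) \left( \frac{dt}{d\zeta} - \frac{dx}{d\zeta} \right)^2$$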

For a photon moving in the same direction as the light beam $dt = dx$ and $d^2y/d\zeta^2 = d^2z/d\zeta^2 = 0$. We have once
again derived the Tolman-Ehrenfest-Podolsky effect. Note we never had to solve for h.

Incidentally, if you are a bit unsure of $dt = dx$, the condition $ds = 0$ for a light beam moving in the x-direction
amounts to $(1 + h)\, dt^2 - 2h\, dt\, dx - (1 - h)\, dx^2 = 0$. Upon division by $dt^2$ we obtain $-(1 + h) + 2hv + (1 - h) v^2 = 0$,
with $v \equiv dx/dt$. The quadratic equation has two roots $v = \mp (1 \pm h)/(1 - h)$. The lower sign gives $v = 1$,
and thus for a photon moving in the same direction as the light beam $dx/dt = 1$. In contrast, the upper sign,
$v = -(1 + h)/(1 - h)$, describes a photon moving in the opposite direction.
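
Both roots, and what they imply for the transverse acceleration, can be checked in exact arithmetic. This is an illustrative sketch with a made-up value of h; the function names are inventions for the example. It confirms that $v = 1$ and $v = -(1 + h)/(1 - h)$ solve the null condition, and evaluates the factor $(dt/d\zeta - dx/d\zeta)^2$, in units of $(dt/d\zeta)^2$, that multiplies $\partial_y h$ in the transverse geodesic equation.

```python
from fractions import Fraction

def null_condition(v, h):
    # (1 - h) v^2 + 2 h v - (1 + h), which must vanish for ds = 0
    return (1 - h) * v ** 2 + 2 * h * v - (1 + h)

def transverse_factor(v):
    # (dt/dzeta - dx/dzeta)^2 in units of (dt/dzeta)^2
    return (1 - v) ** 2

h = Fraction(1, 100)                 # a sample weak-field value
v_parallel = Fraction(1)             # photon moving with the beam
v_opposite = -(1 + h) / (1 - h)      # photon moving against the beam

print(null_condition(v_parallel, h), null_condition(v_opposite, h))
print(transverse_factor(v_parallel), transverse_factor(v_opposite))
```

The factor vanishes identically for the co-moving photon, whatever h is, while for the counter-propagating photon it is close to 4, so the transverse pull is there and is as large as it can be.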


Appendix 2: The helicity structure of gravity 


To gain a deeper understanding of the difference between Einstein's and Dr. G's theories let us look at the
helicity structure of the interaction in the two cases. To warm up, consider the interaction between two conserved
currents due to the exchange of a spin 1 particle of momentum k and mass m: $J^\mu_{(1)} J_{(2)\mu} = J^0_{(1)} J^0_{(2)} - J^i_{(1)} J^i_{(2)}$. Use
current conservation $k_\mu J^\mu = 0$ to eliminate $J^0 = k^i J^i/\omega$ (with $\omega \equiv k^0$). We obtain $(k^i k^j/\omega^2 - \delta^{ij})\, J^i_{(1)} J^j_{(2)}$. Let $\vec{k}$
point in the 3rd direction and use $\vec{k}^2 = \omega^2 - m^2$ to write this as $-[(m^2/\omega^2)\, J^3_{(1)} J^3_{(2)} + J^1_{(1)} J^1_{(2)} + J^2_{(1)} J^2_{(2)}]$. We see
that as $m \to 0$ the longitudinal component of the current $J^3$ indeed decouples as explained in chapter II.7 and
we obtain $-(J^1_{(1)} J^1_{(2)} + J^2_{(1)} J^2_{(2)}) = -\tfrac{1}{2} (J^{1+i2}_{(1)} J^{1-i2}_{(2)} + J^{1-i2}_{(1)} J^{1+i2}_{(2)})$, showing explicitly that the photon has helicity ±1.
(Obvious notation: $J^{1 \pm i2} \equiv J^1 \pm i J^2$ etc.)

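Onward to gravity. Consider the interaction $T^{\mu\nu}_{(1)} T_{(2)\mu\nu} - \xi\, T_{(1)} T_{(2)}$, where $\xi = \tfrac{1}{2}$ for Einstein and $\tfrac{1}{3}$ for Dr. G.
For ease of writing I will now omit the subscripts (1) and (2). Conservation $k_\mu T^{\mu\nu} = 0$ allows us to eliminate
$T^{0i} = k^j T^{ji}/\omega$ and $T^{00} = k^j k^i T^{ji}/\omega^2$. Again taking $\vec{k}$ to point in the 3rd direction we obtain the mess

$$\frac{m^4}{\omega^4}\, T^{33} T^{33} + \frac{2m^2}{\omega^2} (T^{13} T^{13} + T^{23} T^{23}) + T^{11} T^{11} + T^{22} T^{22} + 2 T^{12} T^{12} - \xi \left( \frac{m^2}{\omega^2}\, T^{33} + T^{11} + T^{22} \right) \left( \frac{m^2}{\omega^2}\, T^{33} + T^{11} + T^{22} \right)$$

which simplifies in the limit $m \to 0$ to

$$T^{11} T^{11} + T^{22} T^{22} + 2 T^{12} T^{12} - \xi\, (T^{11} + T^{22})(T^{11} + T^{22})$$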


In Einstein's theory, $\xi = \tfrac{1}{2}$ and this becomes

$$\tfrac{1}{2} (T^{11} - T^{22})(T^{11} - T^{22}) + 2 T^{12} T^{12}$$

which, lo and behold, is equal to $\tfrac{1}{4} (T^{1+i2,1+i2}\, T^{1-i2,1-i2} + T^{1-i2,1-i2}\, T^{1+i2,1+i2})$, with
$T^{1 \pm i2, 1 \pm i2} \equiv T^{11} - T^{22} \pm 2i T^{12}$, showing that indeed the graviton
carries helicity ±2. In Dr. G's theory, this would not be the case.
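
The algebra of this last step is easy to confirm with sample numbers. The sketch below is illustrative only and rests on stated assumptions: the transverse components of the two stress-energy tensors are stored in small dictionaries with arbitrary integer entries, the helicity ±2 combination is normalized as $T^{1 \pm i2, 1 \pm i2} \equiv T^{11} - T^{22} \pm 2i T^{12}$, and the function names are invented for the example.

```python
from fractions import Fraction

def transverse_interaction(Ta, Tb, xi):
    # m -> 0 limit: T11 T11 + T22 T22 + 2 T12 T12 - xi (T11 + T22)(T11 + T22)
    return (Ta["11"] * Tb["11"] + Ta["22"] * Tb["22"] + 2 * Ta["12"] * Tb["12"]
            - xi * (Ta["11"] + Ta["22"]) * (Tb["11"] + Tb["22"]))

def helicity_two_part(Ta, Tb):
    # T^{1+i2,1+i2} = T11 - T22 + 2i T12, stored as (real, imaginary)
    pa = (Ta["11"] - Ta["22"], 2 * Ta["12"])
    pb = (Tb["11"] - Tb["22"], 2 * Tb["12"])
    # T_a^{++} T_b^{--} + T_a^{--} T_b^{++} = 2 Re(p_a * conj(p_b))
    return Fraction(1, 4) * 2 * (pa[0] * pb[0] + pa[1] * pb[1])

def helicity_zero_part(Ta, Tb):
    # product of the transverse traces, the structure a scalar would couple to
    return (Ta["11"] + Ta["22"]) * (Tb["11"] + Tb["22"])

Ta = {"11": 5, "22": -2, "12": 3}    # arbitrary sample components
Tb = {"11": 1, "22": 4, "12": -7}

einstein = transverse_interaction(Ta, Tb, Fraction(1, 2))
dr_g = transverse_interaction(Ta, Tb, Fraction(1, 3))
leftover = dr_g - helicity_two_part(Ta, Tb)
print(einstein, helicity_two_part(Ta, Tb), leftover)
```

With $\xi = \tfrac{1}{2}$ the interaction is pure helicity ±2. With $\xi = \tfrac{1}{3}$ a piece proportional to $(T^{11} + T^{22})(T^{11} + T^{22})$ is left over, with coefficient $\tfrac{1}{2} - \tfrac{1}{3} = \tfrac{1}{6}$: the extra helicity 0 exchange.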
Exercises


VIII.1.1 Work out $T^{\mu\nu}$ for a scalar field. Draw the Feynman diagram for the contribution of one-graviton exchange
to the scattering of two scalar mesons. Calculate the amplitude and extract the interaction energy between
two mesons sitting at rest, thus deriving Newton's law of gravity.

VIII.1.2 Work out $T^{\mu\nu}$ for the Yang-Mills field.

VIII.1.3 Show that if $h_{\mu\nu}$ does not satisfy the harmonic gauge, we can always make a gauge transformation with
$\epsilon_\nu$ determined by $\partial^2 \epsilon_\nu = -(\partial_\mu h^\mu_\nu - \tfrac{1}{2} \partial_\nu h^\lambda_\lambda)$ so that it does. All of this should be conceptually familiar from
your study of electromagnetism.

VIII.1.4 Count the number of degrees of polarization of a graviton. [Hint: Consider a plane wave
$h_{\mu\nu}(x) = h_{\mu\nu}(k)\, e^{ikx}$ just because it is a bit easier to work in momentum space. A symmetric tensor has 10
components and the harmonic gauge $k_\mu h^\mu_\nu = \tfrac{1}{2} k_\nu h^\lambda_\lambda$ imposes 4 conditions. Oops, we are left with 6
degrees of freedom. What is going on?] [Hint: You can make a further gauge transformation and still
stay in the harmonic gauge. The graviton should have only 2 degrees of polarization.]


VII l.i. 5 The Kaluza-Klein result that we argued by symmetry considerations can of course be derived explicitly. 
Let me sketch the calculation for you. Consider the metric 

ds 2 = g^ v dx^dx v — a 2 [d0 + A /1 (x)dx 11 ] 2 


where 6 denotes an angular variable 0 < 0 < 2n . With A^ = 0, this is just the metric of a curved spacetime, 
which has a circle of radius a attached at every point. The transformation 9 —> 6 + A (x) leaves ds invariant 
provided that we also transform A^{x) —> A^(x) — d^A^x). Calculate the 5-dimensional scalar curvature 
R 5 and show that R s = R 4 — \a l F lxv F lx ' v . Except for the precise coefficient \ this result follows entirely 
from symmetry considerations and from the fact that R 5 involves two derivatives on the 5-dimensional 
metric, as explained in the text. After some suitable rescaling this is the usual action for gravity plus 
electromagnetism. Note that the 5-dimensional metric has the explicit form 


$$g^5_{AB} = \begin{pmatrix} g_{\mu\nu} - a^2A_\mu A_\nu & -a^2A_\mu \\ -a^2A_\nu & -a^2 \end{pmatrix} \qquad (23)$$


VIII.1.6 Generalize the Kaluza-Klein construction by replacing the circles by higher dimensional spheres. Show
that Yang-Mills fields emerge. 


VIII.1.7 Starting with the connection 1-form $\omega^{12} = -\cos\theta\, d\varphi$ for the sphere, show that the scalar curvature is a
constant independent of $\theta$ and $\varphi$.

VIII.1.8 The vielbein for a spacetime with Minkowski metric is defined by $g_{\mu\nu}(x) = e^a_\mu(x)\eta_{ab}e^b_\nu(x)$, where the
Minkowski metric $\eta_{ab}$ replaces the Euclidean metric $\delta_{ab}$. The indices $a$ and $b$ are to be contracted with
$\eta_{ab}$. For example, $R^{ab} = d\omega^{ab} + \omega^{ac}\eta_{cd}\omega^{db}$. Show that everything goes through as expected.



VIII.2


The Cosmological Constant Problem 
and the Cosmic Coincidence Problems 


The force that knows too much 

The word paradox has been debased by loose usage in the physics literature. A real paradox 
should involve a major and clear-cut discrepancy between theoretical expectation and 
experimental measurement. The ultraviolet catastrophe, for example, is a paradox, the 
resolution of which around the dawn of the twentieth century ushered in quantum physics. 
I now come to the most egregious paradox of present day physics. 

The electromagnetic force knows about the particles carrying charge, and the strong 
force knows about the particles carrying color. And the gravitational force? It knows 
everybody! More precisely, anybody carrying energy and momentum. 

Within a particle physics frame of mind, which is the only frame of mind we have 
in exploring the fundamental structure of physics, the graviton can be regarded as just 
another particle. Indeed, given that a massless spin 2 particle couples to the stress-energy 
tensor, one can reconstruct Einstein’s theory. 

Nevertheless, there is an uncomfortable feel to this whole picture. Gravity has to do with 
the curvature of spacetime, the arena in which all fields and particles live. The graviton is 
not just another particle. 

This in essence is the root origin 1 of the paradox of the cosmological constant. The 
graviton is not just another particle—it knows too much! 


The cosmological constant 

In the absence of gravity, the addition of a constant $\Lambda$ to the Lagrangian, $\mathcal{L} \to \mathcal{L} - \Lambda$, has
no effect whatsoever. In classical physics the Euler-Lagrange equations of motion depend

1 For more along this line, see A. Zee, hep-th/0805.2183 in Proceedings of the Conference in Honor of C. N. Yang’s 
85th Birthday, World Scientific, Singapore 2008, p. 131. 
only on the variation of the Lagrangian. In quantum field theory we have to evaluate the
functional integral $Z = \int D\varphi\, e^{i\int d^4x\, \mathcal{L}(x)}$, which upon the inclusion of $\Lambda$ merely acquires
a multiplicative factor. As we have seen repeatedly, a multiplicative factor in $Z$ does not
enter into the calculation of Green’s functions and scattering amplitudes.

Gravity, however, knows about $\Lambda$. Physically, the inclusion of $\Lambda$ corresponds to a shift
in the Hamiltonian $H \to H + \int d^3x\, \Lambda$. Thus, the “cosmological constant” $\Lambda$ describes a
constant energy or mass per unit volume permeating the universe, and of course gravity
knows about it.

More technically, the term in the action $-\int d^4x\, \Lambda$ is not invariant under a coordinate
transformation $x \to x'(x)$. In the presence of gravity, general coordinate invariance requires
that the term $-\int d^4x\, \Lambda$ in the action $S$ be modified to $-\int d^4x \sqrt{g}\, \Lambda$, as I explained
way back in chapter I.11. Thus, the gravitational field $g_{\mu\nu}$ knows about $\Lambda$, the infamous
cosmological constant introduced by Einstein and lamented by him as his biggest mistake.
This often quoted lament is itself a mistake. The introduction of the cosmological constant
is not a mistake: It should be there.


Symmetry breaking generates vacuum energy 

In our discussion on spontaneous symmetry breaking, we repeatedly ignored an additive
term $\mu^4/4\lambda$ that appears in $\mathcal{L}$.

Particle physics is built on a series of spontaneous symmetry breakings. As the universe
cools, grand unified symmetry is spontaneously broken, followed by electroweak symmetry
breaking, then chiral symmetry breaking, just to mention a few that we have discussed.
At every stage a term like $\mu^4/4\lambda$ appears in the Lagrangian, and gravity duly takes note.

How large do we expect the cosmological constant $\Lambda$ to be? As we will see, for our
purposes the roughest order of magnitude estimate suffices. Let us take $\lambda$ to be of order
1. As for $\mu$, for the three kinds of symmetry breaking I just mentioned, $\mu$ is of order $10^{17}$,
$10^2$, and 1 GeV, respectively. We thus expect the cosmological constant $\Lambda$ to be roughly
$\mu^4 = \mu/(\mu^{-1})^3$, where the last form of writing $\mu^4$ reminds us that $\Lambda$ is a mass or energy
density: An energy of order $\mu$ packed into a cube of size $\mu^{-1}$. But this is outrageous even if
we take the smallest value for $\mu$: We know that the universe is not permeated with a mass
density of the order of 1 GeV in every cube of size 1 $(\mathrm{GeV})^{-1}$.

We don’t have to put in actual numbers to see that there is a humongous discrepancy between
theoretical expectation and observational reality. If you want numbers, the current
observational bound on the cosmological constant is $\lesssim (10^{-3}\,\mathrm{eV})^4$. With the grand unification
energy scale, we are off by $(17 + 9 + 3) \times 4 = 116$ orders of magnitude. This is the
mother of all discrepancies!

With the Planck mass $M_{Pl} \sim 10^{19}$ GeV as the natural scale of gravity, we would expect
$\Lambda \sim M_{Pl}^4$ if it is of gravitational origin. We are then off by 124 orders of magnitude. We
are not talking about the crummy calculation of some pitiful theorist not fitting some
experimental curve by a factor of 2.
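
The arithmetic behind these exponents can be checked in a few lines. What follows is a minimal sketch and not part of the original text: the function name `orders_off` and the table of scales are illustrative assumptions, and the inputs are just the rough values of $\mu$ quoted above together with the $10^{-3}$ eV observational scale.

```python
import math

EV_PER_GEV = 1e9  # 1 GeV = 10^9 eV


def orders_off(mu_gev, observed_ev=1e-3):
    """Orders of magnitude by which mu^4 exceeds (observed_ev)^4.

    Both quantities are energy densities in natural units, so the
    mismatch is four times the mismatch of the underlying energy scales.
    """
    scale_ratio = (mu_gev * EV_PER_GEV) / observed_ev
    return 4 * math.log10(scale_ratio)


# Rough symmetry breaking scales from the text, in GeV (order of magnitude only).
scales_gev = {
    "grand unification": 1e17,
    "electroweak": 1e2,
    "chiral": 1.0,
    "Planck": 1e19,
}

for name, mu in scales_gev.items():
    print(f"{name:>17}: off by about {orders_off(mu):.0f} orders of magnitude")
```

Even the smallest scale, the 1 GeV of chiral symmetry breaking, leaves a mismatch of $(9 + 3) \times 4 = 48$ orders of magnitude.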
We can imagine the universe starting out with a negative cosmological constant, fine-tuned
to cancel the cosmological constant generated by the various episodes of spontaneous
symmetry breaking. Or there must be a dynamical mechanism that adjusts the
cosmological constant to zero.

Notice I say zero, because the cosmological constant problem is basically an enormous 
mismatch between the units natural to particle physics and natural to cosmology. Measured 
in units of $\mathrm{GeV}^4$ the cosmological constant is so incredibly tiny that particle physicists
have traditionally assumed that it must be zero and have looked in vain for a plausible 
mechanism to drive it to zero. One of the disappointments of string theory is its inability 
to resolve the cosmological constant problem. As of the writing of this chapter around the 
turn of the millennium, the brane world scenarios (chapter I.6) have generated a great deal
of excitement by offering a glimmer of a hope. Roughly, the idea is that the gravitational 
dynamics of the larger space that our universe is embedded in may cancel the effect of the 
cosmological constant. 


Cosmic coincidence 

But Nature has a big surprise for us. While theorists racked their brains trying to come up 
with a convincing argument that $\Lambda = 0$, observational cosmologists steadily refined their
measurements and discovered dark energy. The “cleanest” explanation of dark energy by 
far is that it represents the cosmological constant. Assuming that this is the case (and 
who knows?), the upper bound on the cosmological constant would be changed to an 
approximate equality 

$\Lambda \sim (10^{-3}\,\mathrm{eV})^4$ !!! (1)

The cosmological constant paradox deepens. Theoretically, it is easier to explain why some 
quantity is mathematically 0 than why it happens to be $\sim 10^{-124}$ in the units natural (?) to
the problem. 

To make things worse, $(10^{-3}\,\mathrm{eV})^4$ happens to be the same order of magnitude as the
present matter density of the universe $\rho_M$. More precisely, dark energy accounts for ~ 74%
of the mass content of the universe, dark matter for ~ 22%, and ordinary matter for ~ 4%.
First, the ordinary matter we know and love is reduced to an almost negligibly small
component of the universe. Second, why should $\rho_M$ be comparable to $\Lambda$ to within a factor
of 3? This is sometimes referred to as the cosmic coincidence problem.

Now the cosmological constant $\Lambda$ is, within our present understanding, a parameter in
the Lagrangian. On the other hand, since most of the mass density of the universe resides
in rest mass, as the universe expands $\rho_M(t)$ decreases as $[1/R(t)]^3$, where $R(t)$ denotes
the scale size of the universe. 2 In the far past, $\rho_M$ was much larger than $\Lambda$, and in the


2 For an easy introduction to cosmology, see A. Zee, Unity of Forces in the Universe, vol. II, chap. 10. 
far future, it will be much smaller. It just so happens that, in this particular epoch of the
universe, when you and I are around, $\rho_M \sim \Lambda$. Or to be less anthropocentric, the epoch
when $\rho_M \sim \Lambda$ happens to be when galaxy formation has been largely completed.

Very bizarre! 

In their desperation, some theorists have even been driven to invoke anthropic 
selection. 3 


3 For a recent review, see A. Vilenkin, hep-th/0106083. 



VIII.3 


Effective Field Theory Approach 
to Understanding Nature 


Low energy manifestation 

The pioneers of quantum field theory, Dirac for example, tended to regard field theory 
as a fundamental description of Nature, complete in itself. As I have mentioned several 
times, in the 1950s, after the success of quantum electrodynamics many leading particle 
physicists rejected quantum field theory as incapable of dealing with the strong and weak 
interactions, not to mention gravity. Then came the great triumph of field theory in the early 
1970s. But after particle physicists retrieved field theory from the dust bin of theoretical 
physics, they realized that the field theories they were studying might be “merely” the low 
energy manifestation of a deeper structure, a structure first identified as a grand unified 
theory and later as a string theory. Thus was developed an outlook known as the effective 
field theory approach, pace Dirac. 

The general idea is that we can use field theory to say something about physics at low 
energies or equivalently long distances even if we don’t know anything about the ultimate 
theory, be it a theory built on strings or some as yet undreamed of structure. An important 
consequence of this paradigm shift was that nonrenormalizable field theories became 
acceptable. I will illuminate these remarks with specific examples. 

The emergence of this effective field theory philosophy, championed especially by Wilson,
marks another example of cross fertilization between condensed matter and particle
physics. Toward the late 1960s, Wilson and others developed a powerful effective field theory
approach to understanding critical phenomena, culminating in Wilson’s Nobel Prize. The
situation in condensed matter physics is in many ways the opposite of that in particle physics,
at least as particle physics was understood in the 1960s. Condensed matter physicists
know the short distance physics, namely the quantum mechanics of electrons and ions. 
But it certainly doesn’t help in most cases to write down the Schrodinger equation for the 
electrons and ions. Rather, what one would like to have is an effective description of how 
a system would respond when probed at low frequency and small wave vector. A striking 
example is the effective theory of the quantum Hall fluid as described in chapter VI.2: The 
relevant degree of freedom is a gauge field, certainly a far cry from the underlying electron.
As in the $\sigma$ model description (chapter VI.4) of quantum chromodynamics, it is fair
to say that without experimental guidance theorists would have a terribly hard time deciding
what the relevant low energy long distance degrees of freedom might be. You have
seen numerous other examples in condensed matter physics, from the Landau-Ginzburg 
theory of superconductivity to Peierls instability. 


The threshold of ignorance 

In our discussion of renormalization, I espoused the philosophy that a quantum field
theory provides an effective description of physics up to a certain energy scale $\Lambda$, a
threshold of ignorance beyond which physics not included in the theory comes into
play. In a nonrenormalizable theory, various physical quantities that we might wish to
calculate will come out dependent on $\Lambda$, thus indicating that the physics at or beyond
the scale $\Lambda$ is essential for understanding the low energy physics we are interested in.
Nonrenormalizable theories suffer from not being totally predictive, but nevertheless they 
may be useful. After all, the Fermi theory of the weak interaction described experiments 
and even foretold its own demise. 

In a renormalizable theory, various physical quantities come out independent of $\Lambda$,
provided that the calculated results are expressed in terms of physical coupling constants 
and masses, rather than in terms of some not particularly meaningful bare coupling 
constants and masses. Low energy physics is not sensitive to what happens at high 
energies, and we are able to parametrize our ignorance of high energy physics in terms of 
a few physical constants. 

From the late 1960s to the 1970s, one main thrust of fundamental physics was to 
classify and study renormalizable theories. As we know, this program was “more than 
spectacularly successful.” It allowed us to pin down the theory of the strong, the weak, and 
the electromagnetic interactions. 


Renormalization group flow and dimensional analysis 

The effective field theory philosophy is intrinsically tied to renormalization group flow. 
In a given field theory, as we flow toward low energies, some couplings may tend to zero 
while others do not (and if they tend to infinity as in QCD, then we are unable to figure 
out the effective theory without experimental input). Thus, the first step is to calculate the 
renormalization group flow. A simple example is given in exercise VIII.3.1. 

In many cases, we can simply use dimensional analysis. As I explained in our earlier 
discussion on renormalization theory, couplings with negative dimensions of mass are 
not important at low energies. To be specific, suppose we add a $g\varphi^6$ term to a $\lambda\varphi^4$ theory.
The coupling $g$ has the dimension of inverse mass squared. Let us define $M^2 = 1/g$. At
low energies, the effect of the $g\varphi^6$ term is suppressed by $(E/M)^2$.
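
The counting behind this statement can be spelled out as follows. This is a sketch in natural units $\hbar = c = 1$ in 4 spacetime dimensions, where the action is dimensionless and every term in the Lagrangian density therefore has mass dimension 4; the bracket notation $[\cdot]$ for mass dimension is the one used elsewhere in the text.

```latex
\begin{align}
[(\partial\varphi)^2] = 4 \;&\Rightarrow\; [\varphi] = 1, \\
[\lambda\varphi^4] = 4 \;&\Rightarrow\; [\lambda] = 4 - 4 = 0, \\
[g\varphi^6] = 4 \;&\Rightarrow\; [g] = 4 - 6 = -2, \qquad g \equiv \frac{1}{M^2}.
\end{align}
```

A dimensionless measure of the strength of the $g\varphi^6$ term must pair $g$ with two powers of some energy. If the only scale available in a low energy process is the energy $E$ itself (masses neglected), that measure is $gE^2 = (E/M)^2$.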
How do we understand Schwinger’s spectacular calculation of the anomalous magnetic
moment of the electron in the effective field theory philosophy? 

Let me first tell the traditional (i.e., pre-Wilsonian) version of the story. A student could
have asked, “Professor Schwinger, why didn’t you include the term $(1/M)\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$ in
the Lagrangian?”

The answer is that we had better not. Otherwise, we would lose our prediction for the
anomalous magnetic moment; it would depend on $M$. Recall that $[\psi] = \frac{3}{2}$ and $[A_\mu] = 1$,
and hence $\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$ has mass dimension $\frac{3}{2} + \frac{3}{2} + 1 + 1 = 5 > 4$. The requirement of
renormalizability, that the Lagrangian be restricted to contain operators of dimension 4 or
less, provides the rationale for excluding this term.

Actually, the “real” punchline of my story is that Schwinger probably would not have 
answered the question. When I took Schwinger’s field theory class, it was well known 
among the students that it was forbidden to ask questions. Schwinger would simply 
ignore any raised hands. There was no opportunity to ask questions after class either: 
As he uttered his last sentence of an invariably beautifully prepared lecture, he would sail 
majestically out of the room. Dirac dealt with questions differently. I was too young to have 
witnessed it, but the story goes that when a student asked, “Professor Dirac, I did not
understand . . . ,” Dirac replied, “That is an assertion, not a question.”

The modern retelling of the magnetic moment story turns it around. We now regard the 
Lagrangian of quantum electrodynamics as an effective Lagrangian which should include 
an infinite sequence of terms of ever higher dimensions, with coefficients parametrizing 
our threshold of ignorance. The physics of electrons and photons is now described by 

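$$\mathcal{L} = \bar\psi\left(i\gamma^\mu(\partial_\mu - ieA_\mu) - m\right)\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{M}\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu} + \cdots$$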

Yes, the term $(1/M)\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$ is there, with some unknown $M$ having the dimension of
a mass. Schwinger’s result, that quantum fluctuations generate a term
$(\alpha/2\pi)(1/2m_e)\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$, should then be interpreted as saying that the anomalous
magnetic moment of the electron is predicted to be $[(\alpha/2\pi)(1/2m_e) + 1/M]$. The close
agreement of $(\alpha/2\pi)(1/2m_e)$ with the experimental value of the anomalous magnetic
moment can then be turned around to set a lower bound on $M$: $M \gg (4\pi/\alpha)m_e$.
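
To get a feeling for the scale involved, one can evaluate $(4\pi/\alpha)m_e$, the value of $M$ at which the unknown $1/M$ term would be as large as Schwinger's term. The sketch below is illustrative only: the inputs $\alpha \approx 1/137.036$ and $m_e \approx 0.511$ MeV are standard values that the text does not quote, and the function name is hypothetical.

```python
import math

ALPHA = 1 / 137.036        # fine structure constant (standard value, not from the text)
M_ELECTRON_MEV = 0.511     # electron mass in MeV (standard value, not from the text)


def schwinger_scale_mev(alpha=ALPHA, m_mev=M_ELECTRON_MEV):
    """Return M = [(alpha/2pi)(1/2m)]^(-1) = (4 pi / alpha) m, in MeV.

    At this value of M the 1/M term in the effective Lagrangian would
    equal the one-loop term, so agreement with experiment requires M
    to be much larger than this.
    """
    one_loop_coefficient = (alpha / (2 * math.pi)) / (2 * m_mev)  # units of 1/MeV
    return 1 / one_loop_coefficient


print(f"alpha/2pi       = {ALPHA / (2 * math.pi):.6f}")
print(f"(4pi/alpha) m_e = {schwinger_scale_mev():.0f} MeV")
```

With these inputs the scale comes out a little under 1 GeV; how far above it $M$ must lie depends on how precisely theory and experiment agree.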

Equivalently, Schwinger’s result predicts the anomalous magnetic moment of the electron
if we have independent evidence that $M$ is much larger than $[(\alpha/2\pi)(1/2m_e)]^{-1}$. I
want to emphasize that all of this makes total physical sense. For example, if you speculate
that the electron has some finite size $a$, then you would expect $M \sim 1/a$. The anomalous
magnetic moment calculation gives an upper bound for $a$, telling us that the electron must
be pointlike down to some small scale. Alternatively, we could have had independent evidence,
from electron scattering for example, that $a$ has to be smaller than a certain length,
thus giving us a lower bound on $M$.

To underscore this point, imagine that in 1948 we followed Schwinger and quickly 
calculated the anomalous magnetic moment of the proton. We could literally have done 
it in 3 seconds, since all we have to do is replace $m_e$ by $m_p$ in the Lagrangian, thus
obtaining $(\alpha/2\pi)(1/2m_p)\bar\psi\sigma^{\mu\nu}\psi F_{\mu\nu}$, which would of course disagree resoundingly with
experiment. The disagreement tells us that we had not included all the relevant physics,
namely that the proton interacts strongly and is not pointlike. Indeed, we now know that 
the anomalous magnetic moment of the proton gets contributions from the anomalous 
magnetic moments and the orbital motion of the quarks inside the proton. 


Effective theory of proton decay 

It may seem that with the effective field theory approach we lose some predictive power. But 
effective field theories can also be surprisingly predictive. Let me give a specific example. 
Suppose we had never heard of grand unified theory. All we know is the $SU(3) \otimes SU(2) \otimes
U(1)$ theory. An experimentalist tells us that he is planning to see if the proton would decay.

Without the foggiest notion about what would cause the proton to decay we can still 
write down a field theory to describe proton decay. The Lagrangian $\mathcal{L}$ is to be constructed
out of quark $q$ and lepton $l$ fields and must satisfy the symmetries that we know. Three
quarks disappear, so we write down schematically $qqq$, but three spinors do not a Lorentz
scalar make. We have to include a lepton field and write $qqql$.

Since four fermion fields are involved, the terms $qqql$ have mass dimension 6 and so in
$\mathcal{L}$ they have to appear as $(1/M^2)qqql$ with some mass $M$, corresponding to the mass scale
of the physics responsible for proton decay. The experimental lower bound on the lifetime 
of the proton sets a lower bound on M. 
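
How a lifetime bound translates into a bound on $M$ can be sketched by dimensional analysis. The estimate $\Gamma \sim m_p^5/M^4$ used below is an assumption of this sketch, not a formula from the text: the amplitude carries $1/M^2$, so the rate carries $1/M^4$, and the proton mass $m_p$ is taken as the only other scale, with unknown coefficients and hadronic matrix elements set to 1. The lifetime of $10^{34}$ years is likewise only an illustrative input, and all names are hypothetical.

```python
HBAR_GEV_S = 6.582e-25       # hbar in GeV * s, converts a lifetime into a width
SECONDS_PER_YEAR = 3.156e7
M_PROTON_GEV = 0.938


def min_scale_gev(lifetime_years):
    """Lower bound on M (GeV) implied by a proton lifetime bound.

    Assumes Gamma ~ m_p^5 / M^4 with all dimensionless factors set to 1,
    so the result is an order of magnitude estimate at best.
    """
    max_width_gev = HBAR_GEV_S / (lifetime_years * SECONDS_PER_YEAR)
    return (M_PROTON_GEV ** 5 / max_width_gev) ** 0.25


# Illustrative lifetime bound, assumed for this example only.
print(f"M > {min_scale_gev(1e34):.1e} GeV")
```

Because $M$ scales only as the fourth root of the lifetime, improving the experimental bound by a factor of $10^4$ raises the bound on $M$ by just a factor of 10.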

It is instructive to contrast this analysis with an (imagined) effective field theory analysis 
of proton decay long before the concept of quarks was invented, say around 1950. We would 
construct an effective Lagrangian out of the available fields, namely the proton field p, 
the electron field $e$, and the pion field $\pi$, and thus write down the dimension 4 operator
$f\,p\,e^+\pi^0$ with some dimensionless constant $f$. To estimate $f$, we would naively compare
this operator with the one describing pion-nucleon coupling (chapter IV.2) $g\,p\,n\,\pi^+$ in the
effective Lagrangian. Since $f\,p\,e^+\pi^0$ violates isospin invariance, we might expect $f \sim \alpha g$,
namely the same order as $g$ multiplied by some measure of isospin breaking, say the fine
structure constant. But this would give an unacceptably short lifetime to the proton. We
are forced to set $f$ to a ridiculously small number, which seems highly unnatural. Thus, at
least in hindsight, we can say that the extremely long lifetime of the proton almost points
to the existence of quarks. The key, as we saw above, is the promotion of the mass dimension
of the term in the effective Lagrangian responsible for proton decay from 4 to 6. (Can the
cosmological constant puzzle be solved in the same way?) 

Another way of saying this is that $SU(3) \otimes SU(2) \otimes U(1)$ plus renormalizability predicts
one of the most striking facts of the universe, the stability of the proton. In contrast, the 
old pion-nucleon theory glaringly failed to explain this experimental fact. 

In accordance with our philosophy, $\mathcal{L}$ must be invariant under $SU(3) \otimes SU(2) \otimes U(1)$,
under which quark and lepton fields transform rather idiosyncratically, as we saw in
chapter VII.5. To construct $\mathcal{L}$ we have to sit down and list all Lorentz invariant $SU(3) \otimes
SU(2) \otimes U(1)$ terms of the form $qqql$.
Sitting down, we would find that, assuming only one family of quarks and leptons for
simplicity, there are only four terms we can write down for proton decay, which I list here
for the sake of completeness: $(\tilde{l}_L C q_L)(u_R C d_R)$, $(e_R C u_R)(\tilde{q}_L C q_L)$, $(\tilde{l}_L C q_L)(\tilde{q}_L C q_L)$, and
$(e_R C u_R)(u_R C d_R)$. Here $l_L = \binom{\nu}{e}_L$ and $q_L = \binom{u}{d}_L$ denote the lepton and quark doublet
of $SU(2) \otimes U(1)$, the twiddle is defined by $\tilde{l}_j = l_i\varepsilon^{ij}$ with $SU(2)$ indices $i, j = 1, 2$ (see
appendix B), and $C$ denotes the charge conjugation matrix. Color indices on the quark
fields are contracted in the only possible way. The effective Lagrangian is then given by the
sum of these four terms, with four unknown coefficients.

The effective field theory tells us that all possible baryon number violating decay processes
can be determined in terms of four unknowns. We expect that these predictions will
hold to an accuracy of order $(M_W/M)^2$. (If $M_W$ were zero, $SU(3) \otimes SU(2) \otimes U(1)$ would
be exact.)

Of course, we can increase our predictive power by making further assumptions. For 
example, if we think that proton decay is mediated by a vector particle, as in a generic grand 
unified theory, then only the first two terms in the above list are allowed. In a specific grand 
unified theory, such as the $SU(5)$ theory, the two unknown coefficients are determined in
terms of the grand unified coupling and the mass of the X boson. 

To appreciate the predictive power of the effective field theory approach, inspect the 
list of the four possible operators. We can immediately predict that while proton decay 
violates both baryon number $B$ and lepton number $L$, it conserves the combination $B - L$.
I emphasize that this is not at all obvious before doing the analysis. Could you have told 
the experimentalist which of the two possible modes $n \to e^+\pi^-$ or $n \to e^-\pi^+$ he should
expect? A priori, it could well be that $B + L$ is conserved.

Note that Fermi’s theory of the weak interaction would be called an effective field theory 
these days. Of course, in contrast to proton decay, beta decay was actually seen, and the 
prediction from this sort of symmetry analysis, namely the existence of the neutrino, was 
triumphantly confirmed. 

Along the same line, we could construct an effective field theory of neutrino masses. 
Surely one of the most exciting experimental discoveries in particle physics of recent 
years was that neutrinos are not massless. Let us construct an $SU(2) \otimes U(1)$ invariant
effective theory. Since $\nu_L$ resides inside $l_L$, without doing any detailed analysis we can see
that a dimension-5 operator is required: schematically $l_Ll_L$ contains the desired neutrino
bilinear but it carries hypercharge $Y/2 = -1$; on the other hand, the Higgs doublet $\varphi$
carries hypercharge $+\frac{1}{2}$, and so the lowest dimensional operator we can form is of the
form $ll\varphi\varphi$ with dimension $\frac{3}{2} + \frac{3}{2} + 1 + 1 = 5$. Thus, the effective $\mathcal{L}$ must contain a term
$(1/M)ll\varphi\varphi$, with $M$ the mass scale of the new physics responsible for the neutrino mass.
Then, by dimensional analysis we can estimate $m_\nu \sim m_l^2/M$, with $m_l$ some typical charged
lepton mass. If we take $m_l$ to be the muon mass $\sim 10^2$ MeV and $m_\nu \sim 10^{-1}$ eV, we find
$M \sim (10^2\,\mathrm{MeV})^2/[10^{-1}(10^{-6}\,\mathrm{MeV})] = 10^8$ GeV.
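
The unit conversions in this estimate are easy to get wrong, so here is a minimal sketch that redoes them explicitly. The function name and signature are illustrative assumptions; the only physics input is the dimensional estimate $m_\nu \sim m_l^2/M$ from the paragraph above.

```python
def new_physics_scale_gev(m_lepton_mev, m_nu_ev):
    """Solve m_nu ~ m_l^2 / M for M and return it in GeV."""
    m_nu_mev = m_nu_ev * 1e-6            # eV -> MeV
    scale_mev = m_lepton_mev ** 2 / m_nu_mev
    return scale_mev * 1e-3              # MeV -> GeV


# Inputs from the text: m_l ~ 10^2 MeV (muon), m_nu ~ 10^-1 eV.
print(f"M ~ {new_physics_scale_gev(1e2, 1e-1):.0e} GeV")
```

Since $M$ is inversely proportional to $m_\nu$, a lighter neutrino pushes the scale of the new physics higher.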

The philosophy of effective field theories valid up to a certain energy scale $\Lambda$ seems
so obvious by now that it is almost difficult to imagine that at one time many eminent 
physicists demanded much more of quantum field theory: that it be fundamental up to 
arbitrarily high energy scales. 
Indeed, we now regard all quantum field theories as effective field theories. For all we
know, spacetime on some short distance does consist of a lattice, and so the Yang-Mills
action is but the leading term in an expansion of the Wilson lattice action. The Einstein-
Hilbert Lagrangian, being nonrenormalizable, is a fortiori “merely” the leading term in
an effective field theory

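$$\mathcal{L} = \sqrt{-g}\left\{M_\Lambda^4 + M_P^2R + c_1R^2 + c_2R_{\mu\nu}R^{\mu\nu} + c_3R_{\mu\nu\sigma\rho}R^{\mu\nu\sigma\rho} + \frac{1}{M^2}(d_1R^3 + \cdots) + \cdots\right\}$$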

Here $c_{1,2,3}$ and $d_1$ are dimensionless numbers presumably of order 1. The three terms
quadratic in the curvature involve four powers of derivatives versus the two powers in
the Einstein-Hilbert term, and hence their effects relative to the leading terms are suppressed
by $(E/M_P)^2$ with $E$ an energy scale characteristic of the process we are studying.
Thus, these so-called Weyl-Eddington terms could be safely ignored in any conceivable
experiment. [A technical aside: The Gauss-Bonnet theorem implies that the combination
$(R^2 - 4R_{\mu\nu}R^{\mu\nu} + R_{\mu\nu\sigma\rho}R^{\mu\nu\sigma\rho})$ is a total derivative, so $c_3$ can be effectively set to 0, but that
is beside the point here.] We have indicated only one representative dimension 6 term
$R^3$ (out of many). Its coefficient, in accordance with high school dimensional analysis, is
suppressed by two powers of some mass $M$.

What do we expect the mass scale $M$ to be? Suppose we lived in a universe with only gravity
(and of course we don’t, actually); then once again, we could risk being presumptuous and
take $M$ to be the intrinsic mass scale of gravity, namely the Planck mass $M_P$, but we have
not yet recovered from our third-degree burn from supposing that $M_\Lambda \sim M_P$. If we could
ignore the cosmological constant problem for a moment, then the standard (but quite
possibly wrong!) consensus is that in a universe of pure gravity our theory of gravity is an
effective expansion in powers of $(E/M_P)^2$.

Alternatively, we could treat $\mathcal{L}$ as the effective theory of gravity after we integrate out all
the matter degrees of freedom. In that case, $M$ would be of order $m_e$ (imagine gravitons
coupled to an electron loop; see exercise VIII.3.5), or perhaps even $m_\nu$ (generated by a
neutrino loop).


Effective field theory of the blue sky 

As another application of the effective field theory philosophy, consider the scattering of
electromagnetic waves on an electrically neutral spinless particle described by a scalar field
$\Phi$. Since $\Phi$ is neutral, the lowest dimension gauge invariant term that can be added to
$\mathcal{L} = \partial\Phi^\dagger\partial\Phi - m^2\Phi^\dagger\Phi + \cdots$ is $(1/M^2)\Phi^\dagger\Phi F_{\mu\nu}F^{\mu\nu}$. A factor of $1/M^2$, with $M$ some mass
scale, has to be included with the dimension $1 + 1 + 2 + 2 = 6$ operator to bring the high
school dimension down to 4. The two powers of derivative in $F_{\mu\nu}F^{\mu\nu}$ tell us immediately
that the amplitude for photon scattering on this neutral particle goes like $\mathcal{M} \propto \omega^2$, with
$\omega$ the frequency of the electromagnetic wave. Thus we conclude that the scattering cross
section varies like $\sigma(\omega) \propto \omega^4$.





We have arrived at Rayleigh’s celebrated explanation of the color of the sky. In passing 
through the atmosphere red light scatters less than blue light on air molecules and hence 
the sky is blue. 
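
The size of the effect follows directly from $\sigma(\omega) \propto \omega^4$, that is, $\sigma \propto 1/\lambda^4$ in terms of the wavelength $\lambda$. The sketch below is illustrative: the representative wavelengths of 450 nm for blue and 700 nm for red are assumptions of this example rather than numbers from the text, and the function name is hypothetical.

```python
def rayleigh_ratio(wavelength_a_nm, wavelength_b_nm):
    """Return sigma(a) / sigma(b) for a cross section scaling as omega^4.

    Frequency is inversely proportional to wavelength, so the ratio of
    cross sections is (wavelength_b / wavelength_a)^4.
    """
    return (wavelength_b_nm / wavelength_a_nm) ** 4


# Representative wavelengths, assumed for illustration only.
BLUE_NM = 450.0
RED_NM = 700.0
print(f"blue/red scattering ratio: {rayleigh_ratio(BLUE_NM, RED_NM):.1f}")
```

With these choices blue light scatters several times more strongly than red, and the fourth power makes the ratio quite sensitive to the wavelengths chosen.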

For application to spinless atoms or molecules, we can pass to the nonrelativistic limit
as described in chapter III.5, setting $\Phi = (1/\sqrt{2m})e^{-imt}\varphi$, so that the effective Lagrangian
now reads

$$\mathcal{L} = \varphi^\dagger i\partial_0\varphi - \frac{1}{2m}\partial_i\varphi^\dagger\partial_i\varphi + \frac{1}{mM^2}\varphi^\dagger\varphi(c_1E^2 - c_2B^2) + \cdots$$

In this case, since we understand the microscopic physics governing atoms and
molecules, we know perfectly well what the mass scale $M$ represents. The coupling of
a photon to an electrically neutral system such as an atom or a molecule must vanish like
the characteristic size $d$ of the system, since as $d \to 0$ the positive and negative charges are
on top of each other, giving a vanishing net coupling to the photon. Rotational invariance
implies that the coupling $\sim \vec{k}\cdot\vec{d}$. The scattering amplitude then goes like $\mathcal{M} \propto (\omega d)^2$,
since the coupling has to act twice, once for the incoming photon and once for the outgoing
photon. (Note that by rotational invariance the expectation value of the operator $\vec{d}$
vanishes, but we are doing second order perturbation theory so that we have to evaluate
the expectation value of a quantity quadratic in $\vec{d}$.) 1 Squaring $\mathcal{M}$ and invoking some
elementary quantum mechanics and dimensional analysis, we obtain the cross section
$\sigma(\omega) \sim d^6\omega^4$.


Appendix: Reshuffling terms in effective field theory 


The Lagrangian of an effective field theory consists of an infinite sequence of terms arranged in an orderly
progression of higher and higher mass dimension, constrained only by the assumed symmetries of the theory.
In fact, some terms could be effectively eliminated. To explain this, we focus on a toy example:

$$\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - \lambda\varphi^4 + \frac{1}{M^2}\left(a\varphi^6 + b\varphi^3\partial^2\varphi + c(\partial^2\varphi)^2\right) + O\!\left(\frac{1}{M^4}\right) \qquad (1)$$

We are secretly dealing with the action and thus we freely integrate by parts. For arithmetical simplicity, we did
not include a mass term, so that to leading order in $1/M$ the equation of motion reads simply $\partial^2\varphi = 0$. The three
possible dimension 6 terms are shown explicitly [we integrate by parts to get rid of the term $\varphi^2(\partial\varphi)^2$].

Are we allowed to use the equation of motion to eliminate the two dimension 6 terms that are proportional
to $\partial^2\varphi$?

We know that we could make a field redefinition without changing the on shell amplitudes, so let us redefine
$\varphi \to \varphi + (1/M^2)F$. Then $\frac{1}{2}(\partial\varphi)^2 \to \frac{1}{2}(\partial\varphi)^2 - (1/M^2)F\partial^2\varphi + O(1/M^4)$ and $\lambda\varphi^4 \to \lambda(\varphi^4 + (4/M^2)\varphi^3F +
O(1/M^4))$. Set $F = p\varphi^3 + q\partial^2\varphi$. We see that with an appropriate choice of $p$ and $q$ we can cancel off $b$ and $c$.
Notice that in the process we also change $a$ to some other value.
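
The choice of $p$ and $q$ can be made explicit. The following is a worked sketch, not part of the original text: it expands $(\varphi + F/M^2)^4 = \varphi^4 + (4/M^2)\varphi^3F + O(1/M^4)$, integrates the kinetic term by parts, and collects the $1/M^2$ terms for $F = p\varphi^3 + q\partial^2\varphi$. The primed symbol $a'$ for the shifted coefficient is notation introduced here.

```latex
\begin{align}
\mathcal{L} \to{}& \tfrac{1}{2}(\partial\varphi)^2 - \lambda\varphi^4
  + \frac{1}{M^2}\Big[(a - 4\lambda p)\,\varphi^6
  + (b - p - 4\lambda q)\,\varphi^3\partial^2\varphi
  + (c - q)\,(\partial^2\varphi)^2\Big] + O\!\left(\frac{1}{M^4}\right), \\
q ={}& c, \qquad p = b - 4\lambda c
  \quad\Longrightarrow\quad
  a' = a - 4\lambda\,(b - 4\lambda c).
\end{align}
```

Both terms proportional to $\partial^2\varphi$ are gone, at the price of shifting the $\varphi^6$ coefficient from $a$ to $a'$ and generating new terms at order $1/M^4$.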

The answer to the question is yes, but the naive statement that the equation of motion $\partial^2\varphi = 0$ empowers us
to simply set $\partial^2\varphi$ to zero in the nonleading terms in the effective field theory is, legalistically speaking, incorrect,
or at least misleading. We see that we actually generated $O(1/M^4)$ terms and changed the $\varphi^6$ term. Thus, more
correctly, a field redefinition allows us to shuffle terms around and to higher order. The net effect, however, is
the same as if we trusted the naive statement and set $\partial^2\varphi$ to zero in the nonleading terms.


1 For details, see, for example, J. J. Sakurai, Advanced Quantum Mechanics, Addison-Wesley, New York, 1967,
p. 47.
This procedure works for fermions also. As an example, consider the effective Lagrangian $\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu -
m)\psi + (1/M^3)\bar\psi(i\gamma^\mu\partial_\mu - m)\psi(\bar\psi\psi) + \cdots$. Then the field redefinition $\psi \to \psi - (1/2M^3)\psi(\bar\psi\psi)$ gets rid of the
dimension 7 term shown.

We could also apply what we just learned to the effective theory of gravity if without any understanding we
set the cosmological constant to zero. Also, use the Gauss-Bonnet theorem to get rid of the $R_{\mu\nu\sigma\rho}R^{\mu\nu\sigma\rho}$ term, so
that we have

$$\mathcal{L} = \sqrt{-g}\left\{M_P^2R + c_1R^2 + c_2R_{\mu\nu}R^{\mu\nu} + \frac{1}{M^2}(d_1R^3 + \cdots) + \cdots\right\} \qquad (2)$$

Make a field redefinition g^ v —> g^ v + 8g^ v and use 

S j d A xJ=gR = - J d A xJ=g(R' lv - ^g My ^)5g MV 

Set Sgpy = pR /xv + qg^ v R. Then we can cancel off C\ and c 2 with a judicious choice of p and q. I emphasize that 
this works only if we set the cosmological constant to zero without any ado. 
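
As a sketch of the "judicious choice," the contraction can be done in two lines. The only inputs are the variation formula quoted above and $g^{\mu\nu}g_{\mu\nu} = 4$; the variation of the $c_1$ and $c_2$ terms themselves is of higher order in $1/M_P^2$ and is dropped here.

```latex
% With \delta g_{\mu\nu} = p\,R_{\mu\nu} + q\,g_{\mu\nu}R and g^{\mu\nu}g_{\mu\nu} = 4:
\left(R^{\mu\nu} - \tfrac12 g^{\mu\nu}R\right)\delta g_{\mu\nu}
  = p\,R^{\mu\nu}R_{\mu\nu} - \left(\tfrac12 p + q\right)R^2

% so the variation of the leading term M_P^2 R shifts the dimension 4 coefficients by
c_1 \to c_1 + M_P^2\left(\tfrac12 p + q\right), \qquad c_2 \to c_2 - M_P^2\,p ,

% which both vanish for
p = \frac{c_2}{M_P^2}, \qquad q = -\frac{c_1 + \tfrac12 c_2}{M_P^2}.
```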


Exercises 


VIII.3.1 Consider

$$\mathcal{L} = \tfrac12\left[(\partial\varphi_1)^2 + (\partial\varphi_2)^2\right] - \lambda(\varphi_1^4 + \varphi_2^4) - g\varphi_1^2\varphi_2^2 \tag{3}$$

We have taken the $O(2)$ theory from chapter I.10 and broken the symmetry explicitly. Work out the
renormalization group flow in the $(\lambda - g)$ plane and draw your own conclusions.

VIII.3.2 Assuming the nonexistence of the right handed neutrino field $\nu_R$ (i.e., assuming the minimal particle
content of the standard model) write down all $SU(2) \otimes U(1)$ invariant terms that violate lepton number
$L$ by 2 and hence construct an effective field theory of the neutrino mass. Of course, by constructing a
specific theory one can be much more predictive. Out of the product $l_L l_L$ we can form a Lorentz scalar
transforming as either a singlet or triplet under $SU(2)$. Take the singlet case and construct a theory. [Hint:
For help, see A. Zee, Phys. Lett. 93B: p. 389, 1980.]

VIII.3.3 Let $A$, $B$, $C$, $D$ denote four spin $\frac12$ fields and label their handedness by a subscript: $\gamma^5A_h = hA_h$ with
$h = \pm1$. Thus, $A_+$ is right handed, $A_-$ left handed, and so on. Show that

$$(A_hB_h)(C_{-h}D_{-h}) = -\tfrac12(A_h\gamma^\mu D_{-h})(C_{-h}\gamma_\mu B_h) \tag{4}$$

This is an example of a broad class of identities known as Fierz identities (some of which we will need in
discussing supersymmetry). Argue that if proton decay proceeds in lowest order from the exchange of a
vector particle then only the terms $(l_LCq_L)(u_RCd_R)$ and $(e_RCu_R)(q_LCq_L)$ are allowed in the Lagrangian.

VIII.3.4 Given the conclusion of the previous exercise show that the decay rates for the processes $p \to \pi^+ + \bar\nu$,
$p \to \pi^0 + e^+$, $n \to \pi^0 + \bar\nu$, and $n \to \pi^- + e^+$ are proportional to each other, with the proportionality
factors determined by a single unknown constant [the ratio of the coefficients of $(l_LCq_L)(u_RCd_R)$ and
$(e_RCu_R)(q_LCq_L)$].

For help on these last three exercises see S. Weinberg, Phys. Rev. Lett. 43: 1566, 1979; F. Wilczek and 
A. Zee, ibid. p. 1571; H. A. Weldon and A. Zee, Nucl. Phys. B173: 269, 1980. 

VIII.3.5 Imagine a mythical (and presumably impossible) race of physicists who only understand physics at
energies less than the electron mass $m_e$. They manage to write down the effective field theory for the one
particle they know, the photon,

$$\mathcal{L} = -\tfrac14F_{\mu\nu}F^{\mu\nu} + \frac{1}{m_e^4}\left\{a(F_{\mu\nu}F^{\mu\nu})^2 + b(F_{\mu\nu}\tilde F^{\mu\nu})^2\right\} + \cdots \tag{5}$$

with $\tilde F_{\mu\nu} = \frac12\varepsilon_{\mu\nu\rho\sigma}F^{\rho\sigma}$ the dual field strength as usual and $a$ and $b$ two dimensionless constants
presumably of order unity.

(a) Show that $\mathcal{L}$ respects charge conjugation ($A \to -A$ in this context), parity, and time reversal (and
of course gauge invariance).

(b) Draw the Feynman diagrams that give rise to the two dimension 8 terms shown. The coefficients 
a and b were calculated by Euler and Kockel in 1935 and by Heisenberg and Euler in 1936, quite a 
feat since they did not know about Feynman diagrams and any of the modern quantum field theory 
set up. 

(c) Explain why dimension 6 terms are absent in $\mathcal{L}$. [Hint: One possible term is $\partial_\mu F^{\mu\nu}\partial^\lambda F_{\lambda\nu}$.]

(d) Our mythical physicists do not know about the electron, but they are getting excited. They are going 
to start doing photon-photon scattering experiments with a machine called LPC that could produce 
photons with energy greater than m e . Discuss what they will see. Apply unitarity and the Cutkosky 
rules. 

VIII.3.6 Use the effective field theory approach to show that the scattering cross section of light on an electrically
neutral spin $\frac12$ particle (such as the neutron) goes like $\sigma \propto \omega^2$ to leading order, not $\omega^4$. Argue further that
the constant of proportionality can be fixed in terms of the magnetic moment $\mu$ of the particle. [Historical
note: This result was first obtained in 1954 by F. Low (Phys. Rev. 96: 1428) and by M. Gell-Mann and
Murph L. Goldberger (Phys. Rev. 96: 1433) using much more elaborate arguments.]
Supersymmetry: A Very Brief Introduction


Unifying bosons and fermions 

Let me start with a few of the motivations for supersymmetry. (1) All experimentally known 
symmetries relate bosons to bosons and fermions to fermions. We would like to have a 
symmetry, supersymmetry, relating bosons and fermions. (2) It is natural for fermions to 
be massless (recall chapter VII.6), but not for bosons. Perhaps by pairing the Higgs field 
with a fermion field we can resolve the hierarchy problem mentioned in chapter VII.6. 
(3) Recalling from chapter II.5 that fermions contribute negatively to the vacuum energy, 
you might be tempted to speculate that the cosmological constant problem could be solved 
if we could get the fermion contribution to cancel the boson contribution. 

Disappointingly, it has been more than 30 years 1 since the conception of supersymmetry 
(Golfand and Likhtman constructed the first supersymmetric field theory in 1971) and 
direct experimental evidence is still lacking. All existing supersymmetric theories pair 
known bosons with unknown fermions and known fermions with unknown bosons. 
Supersymmetry has to be broken at some mass scale M beyond the regime already explored 
experimentally, but then (as explained in chapter VIII.2) we might expect a cosmological
constant of order $M^4$.

Be that as it may, supersymmetric field theories have many nice properties (hardly 
surprising since the relevant symmetry is much larger). Supersymmetry has thus attracted 
a multitude of devotees. I give you here as brief an introduction to supersymmetry as I can 
write. In the spirit of a first exposure, I will avoid mentioning any subtleties and caveats, 
hoping that this brief introduction will be helpful to students before they tackle the tomes 
out there. 


1 For a fascinating account of the early history of supersymmetry, see G. Kane and M. Shifman, eds., The
Supersymmetric World: The Beginning of the Theory.
Inventing supersymmetry

Suppose one day you woke up wanting to invent a field theory with a symmetry relating
bosons to fermions. The first thing you would need is the same number of fermionic and
bosonic degrees of freedom. The simplest fermion field is the two-component Weyl spinor
$\psi$. You would now have one complex degree of freedom, 2 so you would have to throw in
a complex scalar field $\varphi$. You could proceed by trial and error: Write down a Lagrangian
including all terms with dimension up to four and then adjust the various parameters in
the Lagrangian until the desired symmetry appears. For instance, you might adjust $\mu$ in
the mass terms until the theory becomes more symmetrical so
that the boson and the fermion have the same mass.

If you were to try to play the game by using a Dirac spinor $\Psi$ and a complex scalar
$\varphi$ you would be doomed to failure from the very start since there would be twice as
many fermionic degrees of freedom as bosonic degrees of freedom. I believe that the 
development of supersymmetry was very much retarded by the fact that until the early 
1970s most field theorists, having grown up with Dirac spinors, had little knowledge of 
Weyl spinors. That was a hint that now is the time for you to get thoroughly familiar with 
the dotted and undotted notation of appendix E. To read this chapter, you need to be fluent 
with that notation. 


Supersymmetric algebra 

It is perfectly feasible to construct this supersymmetric field theory, known as the Wess- 
Zumino model, by trial and error, but instead I will show you an elegant but more abstract 
approach known as the superspace and superfield formalism, invented by Salam and 
Strathdee. We will have to develop a considerable amount of formal machinery. Everything 
is very super here. 

Write the supersymmetry generator taking us from $\varphi$ to $\psi_\alpha$ as $Q_\alpha$ (known as the
supercharge). The statement that $Q_\alpha$ transforms as a Weyl spinor means $[J^{\mu\nu}, Q_\alpha] =
-i(\sigma^{\mu\nu})_\alpha{}^\beta Q_\beta$, where $J^{\mu\nu}$ denotes the generators of the Lorentz group. Of course, since
$Q_\alpha$ is independent of the spacetime coordinates $[P^\mu, Q_\alpha] = 0$. From appendix E we denote
the conjugate of $Q_\alpha$ by $\bar Q_{\dot\alpha}$ and $[J^{\mu\nu}, \bar Q^{\dot\alpha}] = -i(\bar\sigma^{\mu\nu})^{\dot\alpha}{}_{\dot\beta}\bar Q^{\dot\beta}$.

We have to write down the anticommutation relation between the Grassmann objects $Q_\alpha$
and $\bar Q_{\dot\beta}$ and now the work we did in appendix E really pays off. The supersymmetry algebra
is given by

$$\{Q_\alpha, \bar Q_{\dot\beta}\} = 2(\sigma^\mu)_{\alpha\dot\beta}P_\mu \tag{1}$$


2 One complex degree of freedom on mass shell and two complex degrees of freedom off mass shell. See the
superfield formalism below.

We argue by the “what else can it be?” method. The right-hand side must carry the indices
$\alpha$ and $\dot\beta$ and we know that the only object that carries these indices is $\sigma^\mu$. The Lorentz
index $\mu$ has to be contracted and the only vector around is $P_\mu$. The factor of 2 fixes the
normalization of $Q$.

By the same kind of argument we must have $\{Q_\alpha, Q^\beta\} = c_1(\sigma^{\mu\nu})_\alpha{}^\beta J_{\mu\nu} + c_2\delta_\alpha^\beta$. Commuting
with $P^\lambda$ we see that the constant $c_1$ must vanish. Recalling that $Q_\gamma = \varepsilon_{\gamma\beta}Q^\beta$, we
have $\{Q_\alpha, Q_\gamma\} = c_2\varepsilon_{\gamma\alpha}$; but since the left-hand side is symmetric in $\alpha$ and $\gamma$ we have $c_2 = 0$.
Thus, $\{Q_\alpha, Q_\beta\} = 0$ and $\{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} = 0$ (see exercise VIII.4.2).


A basic theorem 


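An important physical fact follows immediately from (1). Contracting with $(\bar\sigma^\nu)^{\dot\beta\alpha}$ we
obtain

$$4P^\nu = (\bar\sigma^\nu)^{\dot\beta\alpha}\{Q_\alpha, \bar Q_{\dot\beta}\} \tag{2}$$

In particular the time component tells us about the Hamiltonian

$$4H = \sum_\alpha\{Q_\alpha, \bar Q_{\dot\alpha}\} = \sum_\alpha\{Q_\alpha, Q_\alpha^\dagger\} = \sum_\alpha(Q_\alpha Q_\alpha^\dagger + Q_\alpha^\dagger Q_\alpha) \tag{3}$$

We obtain the important theorem that in a supersymmetric field theory any physical state
$|S\rangle$ must have nonnegative energy:

$$\langle S|H|S\rangle = \frac12\sum_\alpha\sum_{S'}|\langle S'|Q_\alpha|S\rangle|^2 \ge 0 \tag{4}$$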
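
For readers who want the intermediate steps, here is one way to spell them out. The trace identity $\mathrm{tr}(\bar\sigma^\nu\sigma^\mu) = 2\eta^{\nu\mu}$ and $\bar\sigma^0 = 1$ are assumed from the two-component conventions of appendix E. The last line keeps both operator orderings separately rather than lumping them together; either way every term is an absolute square.

```latex
% Contract (1) with (\bar\sigma^\nu)^{\dot\beta\alpha}, assuming tr(\bar\sigma^\nu\sigma^\mu) = 2\eta^{\nu\mu}:
(\bar\sigma^\nu)^{\dot\beta\alpha}\{Q_\alpha, \bar Q_{\dot\beta}\}
  = 2\,\mathrm{tr}(\bar\sigma^\nu\sigma^\mu)\,P_\mu = 4P^\nu

% Time component, with \bar\sigma^0 = 1 and \bar Q_{\dot\alpha} = (Q_\alpha)^\dagger:
4H = \sum_\alpha\left(Q_\alpha Q_\alpha^\dagger + Q_\alpha^\dagger Q_\alpha\right)

% Insert a complete set of states \sum_{S'}|S'\rangle\langle S'| = 1 between the two operators:
4\langle S|H|S\rangle
  = \sum_\alpha\sum_{S'}\left(|\langle S'|Q_\alpha^\dagger|S\rangle|^2 + |\langle S'|Q_\alpha|S\rangle|^2\right) \ge 0
```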


Superspace 

Now that we have constructed the supersymmetric algebra let us keep in mind our goal of 
constructing supersymmetric field theories. To do that, we need to figure out and classify 
how fields transform under this supersymmetric algebra. We have to go through a lot of 
formalism, the necessity for which will become clear in due course. 

Imagine that you are trying to invent the superspace formalism. Let us motivate it by
staring at the basic relation (1) $\{Q_\alpha, \bar Q_{\dot\beta}\} = 2(\sigma^\mu)_{\alpha\dot\beta}P_\mu$. A supersymmetric transformation
$Q$ followed by its conjugate $\bar Q_{\dot\beta}$ generates a translation $P_\mu$. Hmm, let’s see, $P_\mu = i(\partial/\partial x^\mu)$
generates translation in $x^\mu$, so perhaps $Q_\alpha$, being Grassmannian, would generate translation
in some abstract Grassmannian coordinate $\theta^\alpha$? (Similarly, $\bar Q_{\dot\beta}$ would generate translation
in $\bar\theta^{\dot\beta}$.)

Salam and Strathdee invented the notion of a superspace with bosonic and fermionic
coordinates $\{x^\mu, \theta^\alpha, \bar\theta_{\dot\alpha}\}$ with the supersymmetry algebra represented by translations in
this space.

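So let us try $Q_\alpha$ and $\bar Q_{\dot\beta}$ being something like $\partial/\partial\theta^\alpha$ and $\partial/\partial\bar\theta^{\dot\beta}$, respectively. But then
$\{Q_\alpha, \bar Q_{\dot\beta}\} = 0$ and we don’t get (1). We have to keep playing around modifying $Q_\alpha$ and $\bar Q_{\dot\beta}$.
You may already see what we need. If we add a term such as $\theta\sigma^\mu\partial_\mu$ to $\bar Q_{\dot\beta}$, then the $\partial/\partial\theta^\alpha$
in $Q_\alpha$ acting on it will produce something like the right-hand side of (1). Similarly,
we will want to add a term such as $\bar\theta\sigma^\mu\partial_\mu$ to $Q_\alpha$. (Once again, the dotted and undotted
notation we worked hard to develop fixes what we must write, namely $(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu$ so that
the indices match and obey the “southwest to northeast” rule.) Thus, we represent the
supercharges as

$$Q_\alpha = \frac{\partial}{\partial\theta^\alpha} - i(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu \tag{5}$$

and

$$\bar Q_{\dot\beta} = -\frac{\partial}{\partial\bar\theta^{\dot\beta}} + i\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\partial_\mu \tag{6}$$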

You see that (1) is now satisfied. Interestingly, when we translate in the fermionic direction 
we have to translate a bit in the bosonic direction as well. 
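
The claim that (1) is satisfied takes only a few lines to check. The sketch below assumes the differential-operator representation written in its first line (relative signs in one common two-component convention) together with $P_\mu = i\partial_\mu$; in another convention the signs move around but the cross-term structure is the same.

```latex
% Assumed representation (one common sign convention):
Q_\alpha = \frac{\partial}{\partial\theta^\alpha} - i(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu ,
\qquad
\bar Q_{\dot\beta} = -\frac{\partial}{\partial\bar\theta^{\dot\beta}} + i\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\partial_\mu

% The two Grassmann derivatives anticommute, and so do the two pieces proportional to \partial_\mu,
% so only the cross terms survive:
\{Q_\alpha, \bar Q_{\dot\beta}\}
  = \left\{\frac{\partial}{\partial\theta^\alpha},\; i\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\partial_\mu\right\}
  + \left\{-i(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu,\; -\frac{\partial}{\partial\bar\theta^{\dot\beta}}\right\}

% Using \{\partial/\partial\theta^\alpha, \theta^\beta\} = \delta_\alpha^\beta and its dotted counterpart:
\{Q_\alpha, \bar Q_{\dot\beta}\} = 2i(\sigma^\mu)_{\alpha\dot\beta}\partial_\mu = 2(\sigma^\mu)_{\alpha\dot\beta}P_\mu
```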

Superfield 

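A superfield $\Phi(x^\mu, \theta^\alpha, \bar\theta_{\dot\alpha})$, as the name suggests, is just a field living in superspace. An
infinitesimal supersymmetry transformation takes

$$\Phi \to \Phi' = (1 + i\xi^\alpha Q_\alpha + i\bar\xi_{\dot\alpha}\bar Q^{\dot\alpha})\Phi \tag{7}$$

with $\xi$ and $\bar\xi$ two Grassmannian parameters.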

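It turns out that we can impose some condition on $\Phi$ and restrict this rather broad
definition a bit. After staring at (5) and (6) for a while, you may realize that there are two
other objects,

$$D_\alpha = \frac{\partial}{\partial\theta^\alpha} + i(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}\partial_\mu$$

and

$$\bar D_{\dot\beta} = -\left[\frac{\partial}{\partial\bar\theta^{\dot\beta}} + i\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\partial_\mu\right]$$

that we can define, sort of the combinations orthogonal to $Q_\alpha$ and $\bar Q_{\dot\beta}$. Clearly, $D_\alpha$ and
$\bar D_{\dot\beta}$ anticommute with $Q_\alpha$ and $\bar Q_{\dot\beta}$. The significance of this fact is that if we impose the
condition $\bar D_{\dot\beta}\Phi = 0$ on the superfield $\Phi$, then according to (7) its transform $\Phi'$ also satisfies
the condition.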

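A superfield $\Phi$ satisfying the condition $\bar D_{\dot\beta}\Phi = 0$ is known as a chiral superfield. The condition
is actually easy to implement: 3 Observe that if we define $y^\mu = x^\mu + i\theta^\alpha(\sigma^\mu)_{\alpha\dot\alpha}\bar\theta^{\dot\alpha}$
(note we are adding two bosonic quantities here), then

$$\bar D_{\dot\beta}y^\mu = -\left[\frac{\partial}{\partial\bar\theta^{\dot\beta}} + i\theta^\beta(\sigma^\nu)_{\beta\dot\beta}\partial_\nu\right]y^\mu = -\left[-i\theta^\alpha(\sigma^\mu)_{\alpha\dot\beta} + i\theta^\beta(\sigma^\mu)_{\beta\dot\beta}\right] = 0$$

Thus, a superfield $\Phi(y, \theta)$ that depends on $y$ and $\theta$ only is a chiral superfield.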

3 This is analogous to the problem of constructing a function $f(x, y)$ satisfying the condition $Lf = 0$ with
$L = [x(\partial/\partial y) - y(\partial/\partial x)]$. We define $r = (x^2 + y^2)^{1/2}$ and observe that $Lr = 0$. Then any $f$ that only depends on
$r$ satisfies the desired condition.

Let us expand $\Phi$ in powers of $\theta$ holding $y$ fixed. Remember that $\theta$ contains two components
$(\theta^1, \theta^2)$. Thus, we can form an object with at most two powers of $\theta$, namely $\theta\theta$, which
you worked out in exercise E.3. Thus, as usual, power series in Grassmannian variables
terminate, and we have

$$\Phi(y, \theta) = \varphi(y) + \sqrt2\,\theta\psi(y) + \theta\theta F(y)$$

with $\varphi(y)$, $\psi(y)$, and $F(y)$ merely coefficients in the series at this stage. We can Taylor
expand once more around $x$:

$$\Phi(y, \theta) = \varphi(x) + \sqrt2\,\theta\psi(x) + \theta\theta F(x) + i\theta\sigma^\mu\bar\theta\,\partial_\mu\varphi(x) - \tfrac12\theta\sigma^\mu\bar\theta\,\theta\sigma^\nu\bar\theta\,\partial_\mu\partial_\nu\varphi(x) + \sqrt2\,\theta\,i\theta\sigma^\mu\bar\theta\,\partial_\mu\psi(x) \tag{8}$$

We see that a chiral superfield $\Phi$ contains a Weyl fermion field $\psi$, and two complex scalar
fields $\varphi$ and $F$.
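
The two terminations used above (no more than two powers of $\theta$, and a Taylor series in $y - x$ that stops at second order) can be checked mechanically. What follows is a minimal sketch, not part of the text: the `Grassmann` class is a hypothetical toy implementation, generators 0 and 1 stand for $\theta^1, \theta^2$, generators 2 and 3 stand for the two components of $\bar\theta$, and the integer coefficients are arbitrary stand-ins for the entries of $\sigma^\mu$.

```python
from itertools import product


class Grassmann:
    """Toy Grassmann-algebra element: {sorted tuple of generator indices: coefficient}."""

    def __init__(self, terms=None):
        # Drop vanishing coefficients so that zero is represented by an empty dict.
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def gen(i):
        """The i-th anticommuting generator."""
        return Grassmann({(i,): 1})

    def __add__(self, other):
        out = dict(self.terms)
        for key, coeff in other.terms.items():
            out[key] = out.get(key, 0) + coeff
        return Grassmann(out)

    def scale(self, c):
        return Grassmann({k: c * v for k, v in self.terms.items()})

    def __mul__(self, other):
        out = {}
        for (a, ca), (b, cb) in product(self.terms.items(), other.terms.items()):
            if set(a) & set(b):
                continue  # a repeated generator squares to zero
            merged = list(a + b)
            # Sign of the permutation that sorts the generators (count inversions).
            inversions = sum(
                1
                for i in range(len(merged))
                for j in range(i + 1, len(merged))
                if merged[i] > merged[j]
            )
            key = tuple(sorted(merged))
            out[key] = out.get(key, 0) + (-1) ** inversions * ca * cb
        return Grassmann(out)

    def is_zero(self):
        return not self.terms


th1, th2, tb1, tb2 = (Grassmann.gen(i) for i in range(4))

# Three arbitrary linear combinations of theta^1 and theta^2.
a = th1.scale(2) + th2.scale(3)
b = th1.scale(5) + th2.scale(-1)
c = th1.scale(7) + th2.scale(4)
quadratic = a * b      # proportional to theta^1 theta^2
cubic = a * b * c      # three powers of theta: vanishes

# A bilinear of the type theta sigma^mu thetabar, with arbitrary coefficients.
u = (th1 * tb1) + (th1 * tb2).scale(2) + (th2 * tb1).scale(3) + (th2 * tb2).scale(4)
u2 = u * u             # proportional to theta^1 theta^2 thetabar_1 thetabar_2
u3 = u2 * u            # vanishes: no third-order term in the Taylor expansion

print(quadratic.terms, cubic.is_zero(), u2.terms, u3.is_zero())
```

The same bookkeeping explains why (8) has exactly six terms: anything with three $\theta$'s, or with more than two powers of the bilinear $\theta\sigma^\mu\bar\theta$, is identically zero.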


Finding a total divergence 

Let’s do a bit of dimensional analysis for fun and profit. Given that $P_\mu$ has the dimension
of mass, which we write as $[P_\mu] = 1$ using the same notation as in chapter III.2, then (1),
(5), and (6) tell us that $[Q] = [\bar Q] = \frac12$ and $[\theta] = [\bar\theta] = -\frac12$. Given $[\varphi] = 1$, then (8) tells us
$[\psi] = \frac32$, which we know already, and $[F] = 2$, which we didn’t know. In fact, we have never
met a Lorentz scalar field with mass dimension 2. How can we have a kinetic energy term
for $F$ in $\mathcal{L}$ with dimension 4? We can’t. The term $F^\dagger F$ already has dimension 4, and any
derivative is going to make the dimension even higher. Also, didn’t we say that with $\varphi$ and
$\psi$ we balance the same number of bosonic and fermionic degrees of freedom?

The field F(x) definitely has something strange about him. What is he doing in our 
theory? 

Under an infinitesimal supersymmetric transformation the superfield changes by $\delta\Phi =
i(\xi Q + \bar\xi\bar Q)\Phi$. Referring to (8), (5), and (6), you can work out how the component fields $\varphi$,
$\psi$, and $F$ transform (see exercise VIII.4.5). But we can go a long way invoking symmetry
and dimensional analysis. For example, $\delta F$ is linear in $\xi$ or $\bar\xi$, which by dimensional
analysis must multiply something with dimension $\frac52$ since $[F] = 2$ and $[\xi] = -\frac12$.
The only thing around with dimension $\frac52$ is $\partial_\mu\psi$, which carries an undotted index. Note it
can’t be $\partial_\mu\bar\psi$ since $\Phi$ does not contain $\bar\psi$. By Lorentz invariance we have to find something
carrying the index $\mu$, and that can only be $(\sigma^\mu)_{\alpha\dot\alpha}$. The dotted index on $(\sigma^\mu)_{\alpha\dot\alpha}$ can only be
contracted with $\bar\xi$. So everything is fixed except for an overall constant:

$$\delta F \sim \partial_\mu\psi^\alpha(\sigma^\mu)_{\alpha\dot\alpha}\bar\xi^{\dot\alpha} \tag{9}$$

Arguing along the same lines you can easily show that $\delta\varphi \sim \xi\psi$ and $\delta\psi \sim \xi F + \partial_\mu\varphi\,\sigma^\mu\bar\xi$.

The important point here is not the overall constant in (9) but that $\delta F$ is a total divergence.

Given any superfield $\Phi$ let us denote by $[\Phi]_F$ the coefficient of $\theta\theta$ in an expansion of $\Phi$
[as in (8)]. What we have learned is that under a supersymmetric transformation $\delta([\Phi]_F)$
is a total divergence and thus $\int d^4x[\Phi]_F$ is invariant under supersymmetry.




Our next observation is that if $\bar D_{\dot\beta}\Phi = 0$, then $\bar D_{\dot\beta}\Phi^2 = 0$ also. In other words, if $\Phi$ is a
chiral superfield, then so is $\Phi^2$ (and by extension, $\Phi^3$, $\Phi^4$, and so forth).


Supersymmetric action 

What do we want to achieve anyway? We want to construct an action invariant under 
supersymmetry. 

Finally, after all this formalism we are ready. In fact, it is almost staring us in the face:
$\int d^4x[\frac12m\Phi^2 + \frac13g\Phi^3 + \cdots]_F$ is invariant under supersymmetry by virtue of the last two
paragraphs. Squaring (8) and extracting the coefficient of $\theta\theta$ we see by inspection that
$[\Phi^2]_F = (2F\varphi - \psi\psi)$. Similarly, $[\Phi^3]_F = 3(F\varphi^2 - \varphi\psi\psi)$. Now do exercise VIII.4.6.
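
The "by inspection" step can be spelled out as follows. This is a sketch that works at fixed $y$ (the extra terms from expanding around $x$ only contribute total derivatives to the $\theta\theta$ coefficient) and assumes the two-component identity $(\theta\psi)(\theta\psi) = -\frac12(\theta\theta)(\psi\psi)$, of the kind derived in the appendix E exercises.

```latex
% At fixed y: \Phi = \varphi + \sqrt2\,\theta\psi + \theta\theta F, and \theta\psi is a commuting (bosonic) object.
% Assumed identity: (\theta\psi)(\theta\psi) = -\tfrac12(\theta\theta)(\psi\psi).
\Phi^2 = \varphi^2 + 2\sqrt2\,\varphi\,\theta\psi + \theta\theta\,(2\varphi F) + 2(\theta\psi)(\theta\psi)
       = \varphi^2 + 2\sqrt2\,\varphi\,\theta\psi + \theta\theta\left(2F\varphi - \psi\psi\right)

\Phi^3 = \varphi^3 + 3\sqrt2\,\varphi^2\,\theta\psi + \theta\theta\left(3F\varphi^2 - 3\varphi\,\psi\psi\right)

% Therefore
\left[\tfrac12 m\Phi^2 + \tfrac13 g\Phi^3\right]_F
  = mF\varphi - \tfrac12 m\,\psi\psi + gF\varphi^2 - g\varphi\,\psi\psi
```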

Looks like we have generated a mass term for the Weyl fermion $\psi$ and its coupling to
the scalar field $\varphi$, but where are the kinetic energy terms, such as $\bar\psi\bar\sigma^\mu\partial_\mu\psi$?


Vector superfield 

The kinetic energy terms contain $\bar\psi$, which does not appear in $\Phi$. To get the conjugate
field $\bar\psi$, we obviously have to use $\Phi^\dagger$, and so we are led to consider $\Phi^\dagger\Phi$. More formalism
here! We call a superfield $V(x, \theta, \bar\theta)$ a vector superfield if $V = V^\dagger$. For example, $\Phi^\dagger\Phi$ is a
vector superfield.

Imagine expanding $\Phi^\dagger\Phi = \varphi^\dagger\varphi + \cdots$ or any vector superfield $V$ in powers of $\theta$ and $\bar\theta$.
The highest power is uniquely $\bar\theta\bar\theta\theta\theta$ since by the properties of Grassmannian variables the
only object we can form is $\theta^1\theta^2\bar\theta_{\dot1}\bar\theta_{\dot2}$. Any object quadratic in $\theta$ and quadratic in $\bar\theta$, such as
$(\theta\sigma^\mu\bar\theta)(\theta\sigma_\mu\bar\theta)$, can be beaten down to $\bar\theta\bar\theta\theta\theta$ by using the kind of identities you discovered
in the exercises in appendix E. Let $[V]_D$ denote the coefficient of $\bar\theta\bar\theta\theta\theta$ in the expansion
of $V$.

Again, dimensional analysis can carry us a long way. If $V$ has mass dimension $n$,
then $[V]_D$ has mass dimension $n + 2$ since $\theta$ and $\bar\theta$ each has mass dimension $-\frac12$. Let
us study how $[V]_D$ changes under an infinitesimal supersymmetry transformation $\delta V =
i(\xi Q + \bar\xi\bar Q)V$. We use the same kind of argument as before: $\delta([V]_D)$ is linear in $\xi$ or $\bar\xi$,
which by dimensional analysis must multiply something with dimension $n + \frac52$ since $[\xi] =
[\bar\xi] = -\frac12$. This can only be the derivative $\partial_\mu$ of something with dimension $n + \frac32$, namely
the coefficients of $\bar\theta\theta\theta$ and $\bar\theta\bar\theta\theta$ in the expansion of $V$. We conclude that $\delta([V]_D)$ has to
have the form $\partial_\mu(\ldots)$, namely that $\delta([V]_D)$ is a total divergence. This is the same type of
argument that allows us to conclude that $\delta([\Phi]_F)$ is a total divergence.

Thus, the action $\int d^4x[\Phi^\dagger\Phi]_D$ is invariant under supersymmetry.

Staring at (8), which I repeat for your convenience,

$$\Phi(y, \theta) = \varphi(x) + \sqrt2\,\theta\psi(x) + \theta\theta F(x) + i\theta\sigma^\mu\bar\theta\,\partial_\mu\varphi(x) - \tfrac12\theta\sigma^\mu\bar\theta\,\theta\sigma^\nu\bar\theta\,\partial_\mu\partial_\nu\varphi(x) + \sqrt2\,\theta\,i\theta\sigma^\mu\bar\theta\,\partial_\mu\psi(x) \tag{10}$$

we see that $\int d^4x[\Phi^\dagger\Phi]_D$ contains $\int d^4x\,\varphi^\dagger\partial^2\varphi$ (from multiplying the first term in $\Phi^\dagger$ with
the fifth term in $\Phi$), $\int d^4x\,\partial\varphi^\dagger\partial\varphi$ (from multiplying the fourth term in $\Phi^\dagger$ with the fourth
term in $\Phi$), $\int d^4x\,\bar\psi\bar\sigma^\mu\partial_\mu\psi$ (from multiplying the second term in $\Phi^\dagger$ with the sixth term
in $\Phi$), and finally $\int d^4x\,F^\dagger F$ (from multiplying the third term in $\Phi^\dagger$ with the third term
in $\Phi$). It is quite amusing how derivatives of fields arise in supersymmetric field theories:
Note that the action $\int d^4x[\Phi^\dagger\Phi]_D$ does not contain derivatives explicitly.

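To summarize, given a chiral superfield $\Phi$ we have constructed the supersymmetric
action

$$S = \int d^4x\left\{[\Phi^\dagger\Phi]_D - ([W(\Phi)]_F + \text{h.c.})\right\} \tag{11}$$

Explicitly, with the choice $W(\Phi) = \frac12m\Phi^2 + \frac13g\Phi^3$, we have

$$S = \int d^4x\left\{\partial\varphi^\dagger\partial\varphi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi + F^\dagger F - (mF\varphi - \tfrac12m\psi\psi + gF\varphi^2 - g\varphi\psi\psi + \text{h.c.})\right\} \tag{12}$$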


An auxiliary field 

From the very beginning the field F seemed strange. Since [F] = 2 we anticipated that it 
cannot have a kinetic energy term with mass dimension 4 and indeed it doesn’t. We see that 
it is not a dynamical field that propagates: it is an auxiliary field (just like $\sigma$ in chapter III.5
and $\xi_\mu$ in chapter VI.3) and can be integrated out in the path integral $\int DF^\dagger DF\,e^{iS}$. Indeed,
collect the terms that depend on $F$ in $S$, namely

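$$F^\dagger F - F(m\varphi + g\varphi^2) - F^\dagger(m\varphi^\dagger + g\varphi^{\dagger2}) = |F - (m\varphi + g\varphi^2)^\dagger|^2 - |m\varphi + g\varphi^2|^2$$

So, integrate over $F$ and $F^\dagger$ and get

$$S = \int d^4x\left\{\partial\varphi^\dagger\partial\varphi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi - |m\varphi + g\varphi^2|^2 + (\tfrac12m\psi\psi + g\varphi\psi\psi + \text{h.c.})\right\} \tag{13}$$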

Note that the scalar potential $V(\varphi^\dagger, \varphi) = |m\varphi + g\varphi^2|^2 \ge 0$ in accordance with (4) and
vanishes at its minimum, giving a zero cosmological constant. Note that we are no longer
free to add an arbitrary constant to $V(\varphi^\dagger, \varphi)$ as we could in a nonsupersymmetric field
theory.
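
Completing the square in $F$ is easy to get wrong by a conjugation, so here is a quick numerical sanity check. It is only a sketch: the function names are made up for this illustration, $m$ and $g$ are taken real with arbitrary values, and the fields are replaced by random complex numbers at a single spacetime point.

```python
import random


def f_terms(F, phi, m, g):
    """F-dependent terms of the Lagrangian: F†F - F A - F† A†, with A = m φ + g φ²."""
    A = m * phi + g * phi**2
    return (F.conjugate() * F - F * A - F.conjugate() * A.conjugate()).real


def completed_square(F, phi, m, g):
    """The same terms written as |F - A†|² - |A|²."""
    A = m * phi + g * phi**2
    return abs(F - A.conjugate()) ** 2 - abs(A) ** 2


def potential(phi, m, g):
    """Scalar potential left behind after F is integrated out: V = |m φ + g φ²|²."""
    return abs(m * phi + g * phi**2) ** 2


random.seed(0)
m, g = 1.3, 0.7  # arbitrary real couplings, chosen only for the check
max_mismatch = 0.0
min_potential = float("inf")
for _ in range(1000):
    F = complex(random.uniform(-2, 2), random.uniform(-2, 2))
    phi = complex(random.uniform(-2, 2), random.uniform(-2, 2))
    gap = abs(f_terms(F, phi, m, g) - completed_square(F, phi, m, g))
    max_mismatch = max(max_mismatch, gap)
    min_potential = min(min_potential, potential(phi, m, g))

# The potential vanishes at the two supersymmetric minima φ = 0 and φ = -m/g.
v_at_origin = potential(0j, m, g)
v_at_second_minimum = potential(complex(-m / g), m, g)
print(max_mismatch, min_potential, v_at_origin, v_at_second_minimum)
```

The check also makes the last remark concrete: the potential is a perfect square, so there is no room for an additive constant.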

As expected, supersymmetric field theories are much more restrictive than ordinary 
field theories, and, duh, also much more symmetric. The formalism described here can 
be extended to construct supersymmetric Yang-Mills theory. 

Another important generalization is to introduce, instead of one supercharge $Q_\alpha$, $\mathcal{N}$ supercharges
$Q_\alpha^I$, with $I = 1, \cdots, \mathcal{N}$ (exercise VIII.4.2). Since each charge $Q_\alpha^I$ transforms
like the $S_z = \frac12$ component of a spin $\frac12$ operator, it takes a state with $S_z = m$ in a supermultiplet
to a state with $S_z = m + \frac12$. Thus the integer $\mathcal{N}$ is bounded from above. For
supersymmetric Yang-Mills theory, the maximum number of supersymmetry generators
is $\mathcal{N} = 4$ if we do not want to introduce fields with spin > 1. Similarly, the most supersymmetric
supergravity theory we could construct (exercise VIII.4.3) has $\mathcal{N} = 8$.
As mentioned in chapter VII.3, if any nontrivial 4-dimensional quantum field theory
turned out to be exactly soluble, the supersymmetric $\mathcal{N} = 4$ Yang-Mills theory is probably
our best bet. In all likelihood, the first relativistic quantum field theory to be solved exactly
would be $\mathcal{N} = 4$ Yang-Mills in the planar large $N$ limit of chapter VII.4.
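
The bounds $\mathcal{N} \le 4$ and $\mathcal{N} \le 8$ quoted above follow from simple counting, sketched below. The counting rule used here is the standard one for massless multiplets and is an assumption not spelled out in the text: starting from the state of highest helicity, each of the $\mathcal{N}$ supercharges can be applied at most once and shifts the helicity by $\frac12$, so $k$ of them can be applied in $\binom{\mathcal{N}}{k}$ ways. The function names are invented for this illustration.

```python
from fractions import Fraction
from math import comb


def multiplet(n_susy, top_helicity):
    """Helicity content reached from a top-helicity state with n_susy supercharges.

    Returns {helicity: number of states}; k distinct supercharges can be
    applied in comb(n_susy, k) ways, each lowering the helicity by 1/2.
    """
    top = Fraction(top_helicity)
    return {top - Fraction(k, 2): comb(n_susy, k) for k in range(n_susy + 1)}


def max_supercharges(max_spin):
    """Largest number of supercharges whose multiplet stays within |helicity| <= max_spin."""
    return int(4 * Fraction(max_spin))


yang_mills = multiplet(4, 1)     # helicities 1, 1/2, 0, -1/2, -1
supergravity = multiplet(8, 2)   # helicities 2 down to -2
too_many = multiplet(5, 1)       # reaches helicity -3/2: needs a field with spin > 1

print(yang_mills, max_supercharges(1), max_supercharges(2), min(too_many))
```

For $\mathcal{N} = 4$ the multiplicities 1, 4, 6, 4, 1 are the familiar content of the Yang-Mills multiplet: one gauge boson (two helicities), four Weyl fermions, and six real scalars.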

I hope that this brief introduction gave you a flavor of supersymmetry and will enable 
you to go on to specialized treatises. 


Exercises 


VIII.4.1 Construct the Wess-Zumino Lagrangian by the trial and error approach.

VIII.4.2 In general there may be $\mathcal{N}$ supercharges $Q_\alpha^I$, with $I = 1, \ldots, \mathcal{N}$. Show that we can have $\{Q_\alpha^I, Q_\beta^J\} =
\varepsilon_{\alpha\beta}Z^{IJ}$, where $Z^{IJ}$ denotes c-numbers known as central charges.

VIII.4.3 From the fact that we do not know how to write consistent quantum field theories with fields having
spin greater than 2 show that the $\mathcal{N}$ in the previous exercise cannot exceed 8. Theories with $\mathcal{N} = 8$
supersymmetry are said to be maximally supersymmetric. Show that if we do not want to include gravity,
$\mathcal{N}$ cannot be greater than 4. Supersymmetric $\mathcal{N} = 4$ Yang-Mills theory has many remarkable properties.

VIII.4.4 Show that $\partial\theta_\alpha/\partial\theta^\beta = \varepsilon_{\alpha\beta}$.

VIII.4.5 Work out $\delta\varphi$, $\delta\psi$, and $\delta F$ precisely by computing $\delta\Phi = i(\xi^\alpha Q_\alpha + \bar\xi_{\dot\alpha}\bar Q^{\dot\alpha})\Phi$.

VIII.4.6 For any polynomial $W(\Phi)$ show that $[W(\Phi)]_F = F[dW(\varphi)/d\varphi]$ + terms not involving $F$. Show that for
the theory (11) the potential energy is given by $V(\varphi^\dagger, \varphi) = |dW(\varphi)/d\varphi|^2$.

VIII.4.7 Construct a field theory in which supersymmetry is spontaneously broken. [Hint: You need at least three
chiral superfields.]

VIII.4.8 If we can construct supersymmetric quantum field theory, surely we can construct supersymmetric
quantum mechanics. Indeed, consider $Q_1 = \frac12[\sigma_1P + \sigma_2W(x)]$ and $Q_2 = \frac12[\sigma_2P - \sigma_1W(x)]$ where the
momentum operator $P = -i(d/dx)$ as usual. Define $Q = Q_1 + iQ_2$. Study the properties of the Hamiltonian
$H$ defined by $\{Q, Q^\dagger\} = 2H$.



A Glimpse of String Theory as a 2-Dimensional Field Theory


Geometrical action for the bosonic string 


In this chapter, I will try to give you a tiny glimpse into string theory. Needless to say, you 
can get only the merest whiff of the subject here, but fortunately excellent texts do exist and 
I believe that this book has prepared you for them. My main purpose is to show you that 
perhaps surprisingly the basic formulation of string theory is naturally phrased in terms 
of a 2-dimensional field theory. 

In chapter I.11 I described a point particle tracing out a world line given by $X^\mu(\tau)$ in
$D$-dimensional spacetime. Recall that the action is given geometrically by the length of
the world line

$$S = -m\int d\tau\sqrt{\frac{dX^\mu}{d\tau}\frac{dX_\mu}{d\tau}} \tag{1}$$

and remains unchanged under reparametrization $\tau \to \tau'(\tau)$. Recall also that classically, $S$
is equivalent to

$$S_{\rm imp} = -\frac12\int d\tau\left(\frac1\gamma\frac{dX^\mu}{d\tau}\frac{dX_\mu}{d\tau} + \gamma m^2\right) \tag{2}$$
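
As a reminder of why the two point-particle actions agree classically, here is the elimination of the auxiliary variable $\gamma$ written out. The shorthand $\dot X^2 \equiv (dX^\mu/d\tau)(dX_\mu/d\tau)$ is introduced only for this sketch.

```latex
% Vary S_imp with respect to \gamma, writing \dot X^2 = \dot X^\mu \dot X_\mu:
\frac{\partial}{\partial\gamma}\left(\frac{1}{\gamma}\dot X^2 + \gamma m^2\right)
  = -\frac{\dot X^2}{\gamma^2} + m^2 = 0
\quad\Longrightarrow\quad \gamma = \frac{\sqrt{\dot X^2}}{m}

% Substituting back:
S_{\rm imp} = -\frac12\int d\tau\left(m\sqrt{\dot X^2} + m\sqrt{\dot X^2}\right)
  = -m\int d\tau\,\sqrt{\dot X^2} = S
```

The same two moves, varying an auxiliary metric and substituting the solution back, are what relate the two string actions.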


Now consider a string sweeping out a world sheet given by $X^\mu(\tau, \sigma)$ in $D$-dimensional
spacetime, which we have already encountered in chapter IV.4 in connection with differential
forms. In analogy with (1), Nambu and Goto proposed an action given geometrically
by the area of the world sheet

$$S_{NG} = T\int d\tau\,d\sigma\sqrt{\det(\partial_aX^\mu\partial_bX_\mu)} \tag{3}$$

where $\partial_1X^\mu = \partial X^\mu/\partial\tau$, $\partial_2X^\mu = \partial X^\mu/\partial\sigma$, and $(\partial_aX^\mu\partial_bX_\mu)$ denotes the $ab$ element of a 2 by
2 matrix. Here, as in (1), $\mu$ ranges over $D$ values: $0, 1, \ldots, D - 1$. The constant $T$ ($= 1/2\pi\alpha'$
with $\alpha'$ the slope of the Regge trajectory in particle phenomenology) corresponds to the
string tension since stretching the string to enlarge the world sheet costs an extra amount
of action proportional to $T$.
In a precise parallel with the discussion for the point particle, it is preferable to avoid
the square root and instead use the action

$$S = \frac12T\int d\tau\,d\sigma\,\gamma^{\frac12}\gamma^{ab}(\partial_aX^\mu\partial_bX_\mu) \tag{4}$$

with $\gamma = \det\gamma_{ab}$ in the path integral to quantize the string. We will now show that $S$ is
equivalent classically to $S_{NG}$.

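As in (2), we vary $S$ with respect to the auxiliary variable $\gamma_{ab}$, which we then eliminate.
For a matrix $M$, $\delta M^{-1} = -M^{-1}(\delta M)M^{-1}$ and $\delta\det M = \delta e^{\mathrm{tr}\log M} = e^{\mathrm{tr}\log M}\,\mathrm{tr}M^{-1}\delta M =
(\det M)\,\mathrm{tr}M^{-1}\delta M$. Thus, $\delta\gamma^{ab} = -\gamma^{ac}\delta\gamma_{cd}\gamma^{db}$ and $\delta\gamma = \gamma\gamma^{ba}\delta\gamma_{ab}$. For ease of writing,
define $h_{ab} \equiv \partial_aX^\mu\partial_bX_\mu$. The variation of the integrand in (4) thus gives

$$\delta[\gamma^{\frac12}\gamma^{ab}h_{ab}] = \gamma^{\frac12}\left[\tfrac12\gamma^{dc}\delta\gamma_{cd}(\gamma^{ab}h_{ab}) - \gamma^{ac}\delta\gamma_{cd}\gamma^{db}h_{ab}\right]$$

Setting the coefficient of $\delta\gamma_{cd}$ equal to 0 we obtain

$$h_{cd} = \tfrac12\gamma_{cd}(\gamma^{ab}h_{ab}) \tag{5}$$

where the indices on $h$ are raised and lowered by the metric $\gamma$. Multiplying (5) by $h^{dc}$ (and
summing over repeated indices) we find $\gamma^{ab}h_{ab} = 2$ and thus $\gamma_{cd} = h_{cd}$. Plugging this into
(4) we find that $S = T\int d\tau\,d\sigma(\det h)^{\frac12}$. Thus, $S$ and $S_{NG}$ are indeed equivalent classically.
The action (4), first discovered by Brink, Di Vecchia, and Howe and by Deser and Zumino,
is known as the Polyakov action.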

Note that (5) determines $\gamma_{ab}$ only up to an arbitrary local rescaling known as a Weyl
transformation:

$$\gamma_{ab}(\tau, \sigma) \to e^{2\omega(\tau, \sigma)}\gamma_{ab}(\tau, \sigma) \tag{6}$$

Thus, the action (4) must be invariant under the Weyl transformation.
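
Both statements, the classical equivalence of (4) with the Nambu-Goto action and the Weyl invariance of (4), are pointwise algebraic facts about 2 by 2 matrices and can be checked numerically. The sketch below makes two simplifying assumptions that are not in the text: the dot product on the $X^\mu$ is taken Euclidean, so that all determinants are positive and no care is needed with signs under the square root, and the factor $T$ and the integration are dropped. All function names are made up for this illustration.

```python
import math
import random


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def induced_metric(dx_dtau, dx_dsigma):
    """h_ab = d_a X . d_b X with a Euclidean dot product (an assumption of this sketch)."""
    tangents = [dx_dtau, dx_dsigma]
    return [
        [sum(u * v for u, v in zip(tangents[a], tangents[b])) for b in range(2)]
        for a in range(2)
    ]


def polyakov_density(gamma, h):
    """(1/2) sqrt(det gamma) gamma^{ab} h_ab: the integrand of (4) without T."""
    gamma_inv = inv2(gamma)
    contraction = sum(gamma_inv[a][b] * h[a][b] for a in range(2) for b in range(2))
    return 0.5 * math.sqrt(det2(gamma)) * contraction


def nambu_goto_density(h):
    """sqrt(det h): the integrand of (3) without T."""
    return math.sqrt(det2(h))


random.seed(1)
D = 4
h = induced_metric(
    [random.uniform(-1, 1) for _ in range(D)],
    [random.uniform(-1, 1) for _ in range(D)],
)

weyl_factor = math.exp(2 * 0.37)  # e^{2 omega} at one world-sheet point, omega arbitrary
gamma_on_shell = [[weyl_factor * h[a][b] for b in range(2)] for a in range(2)]
gamma_generic = [[2.0, 0.3], [0.3, 1.0]]
gamma_rescaled = [[weyl_factor * x for x in row] for row in gamma_generic]

# Any gamma proportional to h solves (5) and reproduces the Nambu-Goto integrand.
on_shell_gap = abs(polyakov_density(gamma_on_shell, h) - nambu_goto_density(h))
# Rescaling an arbitrary gamma leaves the Polyakov integrand unchanged.
weyl_gap = abs(polyakov_density(gamma_rescaled, h) - polyakov_density(gamma_generic, h))
print(on_shell_gap, weyl_gap)
```

The first check uses a metric that differs from $h_{ab}$ by a Weyl factor, which illustrates the remark that (5) fixes $\gamma_{ab}$ only up to a local rescaling.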

Staring at the string action (4), you will recognize that it is just the action for a quantum
field theory of $D$ massless scalar fields $X^\mu(\tau, \sigma)$ in 2-dimensional spacetime with coordinates
$(\tau, \sigma)$, albeit with some unusual signs. The index $\mu$ plays the role of an internal
index, and Poincare invariance in our original $D$-dimensional spacetime now appears as
an internal symmetry. Indeed, a good deal of string theory is devoted to the study of quantum
field theories in 2-dimensional spacetime! It is amusing how quantum field theory
manages to stay on the stage.

To this bosonic string theory we can add fermionic variables in such a way as to make 
the action supersymmetric. The result, as you surely have heard, is superstring theory, 
thought by some to be the theory of everything. 1 


1 “To understand macroscopic properties of matter based on understanding these microscopic laws is just
unrealistic. Even though the microscopic laws are, in a strict sense, controlling what happens at the larger scale,
they are not the right way to understand that. And that is why this phrase, “theory of everything,” sounds sleazy.”—
J. Schwarz, one of the founders of string theory.
This infinitesimal introduction to string theory is all I can give you here, but I hope that
this book has prepared you adequately to begin studying various specialized texts on string
theory. 2


2 For a brief but authoritative introduction, see E. Witten, “Reflections on the Fate of Spacetime,” Physics Today,
April 1996, p. 24.
Closing Words


As I confessed in the preface, I started out intending to write a concise introduction to 
quantum field theory, but the book grew and grew. The subject is simply too rich. As I 
mentioned, after a period of almost being abandoned, quantum field theory came roaring 
back. To quote my thesis advisor Sidney Coleman, the triumph of quantum field theory 
was veritably “a victory parade” that made “the spectator gasp with awe and laugh with 
joy.” 

String theory is beautiful and marvellous, but until it is verified, quantum field theory 
remains the true theory of everything. All of physics can now be said to be derivable from 
field theory. To start with, quantum field theory contains quantum mechanics as a (0 + 1)- 
dimensional field theory, and to end (perhaps) with, string theory may be formulated as a 
(1 + 1)-dimensional field theory. 

Quantum field theory can arguably be regarded as the pinnacle of human thought. 
(Hush, you hear the distant howls of the mathematicians, English professors, philosophers,
and perhaps even a few stuck-up musicologists?) It is a distillation of basic notions
from the very beginning of physics: Newton’s realization that energy is the square of
momentum appears in field theory as the two powers of spatial derivative. But yet—you 
knew that was coming, didn’t you, with field theory set up as the pinnacle et cetera?— 
but yet, field theory in its present form is in my opinion still incomplete and surely some 
bright young minds will see how to develop it further. 

For one thing, field theory has not progressed much beyond the harmonic paradigm, as 
I presaged in the first chapter. The discovery of the soliton and instanton opened up a new 
vista, showing in no uncertain terms that Feynman diagrams ain’t everything, contrary 
to what some field theorists thought. Duality offers one way of linking perturbative weak 
coupling theory to strong coupling, but as yet practically nothing is known of the strong 
coupling regime. When speaking of renormalization groups, we bravely speak of flowing 
to a strong coupling fixed point, but we merely have the boat ticket: We have little idea of
what the destination looks like. Perhaps in the not too distant future, lattice field theorists
can extract the field configurations that dominate. 

Another restriction is to two powers of the derivative, a restriction going back to Newton 
as I remarked above. In modern applications of field theory to problems far beyond particle 
physics, there is no reason at all to impose this restriction. For example, in studying visual 
perception, one encounters field theories much more involved than those we have studied 
in this book. (See the appendix for a brief description.) These field theories are Euclidean 
in any case and the corresponding functional integral with higher derivatives certainly 
makes sense: It is only in Minkowskian theories that we do not know how to handle 
higher derivatives. Newton again—certainly economists consider the rate of change of 
the acceleration as well as acceleration. Another innovative application is the formulation 1 
of a class of problems in nonequilibrium statistical mechanics as field theories. Typically, 
various objects wander around and react when they meet. This class of problems appears 
in areas ranging from chemical reactions to population biology. 

We can go far beyond the restriction on the number of derivatives in the Lagrangian. 
Who said that we can only have integrands of the form “exponential of a spacetime 
integral”? Most modifications you can think of might immediately run afoul of some basic 
principles (for example, $\int D\varphi\, e^{-\int d^dx\, \mathcal{L}_1(\varphi) \int d^dx\, \mathcal{L}_2(\varphi)}$ would violate locality), but surely 
others might not. Another speculative thought I like to entertain goes along the following 
line: Classical and quantum physics are formulated in terms of differential equations 
and functional integrals, respectively. But how are differential equations contained in 
integrals? The answer is that the integrals $\int D\varphi\, e^{-(1/\hbar)\int d^dx\, \mathcal{L}(\varphi)}$ contain a parameter $\hbar$ so 
that in the limit $\hbar$ going to zero the evaluation of the integrals amounts to solving partial 
differential equations. Can we go beyond quantum field theory by finding a mathematical 
operation that in the limit of some parameter $k$ going to zero reduces to doing the integral 
$\int D\varphi\, e^{-(1/\hbar)\int d^dx\, \mathcal{L}(\varphi)}$? 

The arena of local field theory has always been restricted to the set of d real numbers $x^\mu$. 
The recent excitement over noncommutative field theory promises to take us beyond. 
(I was tempted to discuss noncommutative field theory too, but then the nutshell would 
truly burst.) 

But perhaps the most unsatisfying feature of field theory is the present formulation of 
gauge theories. Gauge “symmetry” does not relate two different physical states, but two 
descriptions of the same physical state. We have this strange language full of redundancy 
we can’t live without. We start with unneeded baggage that we then gauge-fix away. We even 
know how to avoid this redundancy from the start but at the price of discretizing spacetime. 
This redundancy of description is particularly glaring in the manufactured gauge theories 
now fashionable in condensed matter physics, in which the gauge symmetry is not there 
to begin with. Also, surely the way we calculate in nonabelian gauge theories by cutting 
the Yang-Mills action up into pieces and doing violence to gauge invariance will be held 
up to ridicule a hundred years from now. I would not be surprised if a brilliant reader of 
this book finds a more elegant formulation of what we now call gauge theories. 

1 By M. Doi, L. Peliti, J. C. Cardy, and others. See for example J. C. Cardy, cond-mat/9607163, “Renormalisation 
Group Approach to Reaction-Diffusion Problems,” in: J.-B. Zuber, ed., Mathematical Beauty of Physics, p. 113. 

Look at the development of the very first field theory, namely Maxwell’s theory of 
electromagnetism. By the end of the nineteenth century it had been thoroughly studied and 
the overwhelming consensus was that at least the mathematical structure was completely 
understood. Yet the big news of the early twentieth century was that the theory, surprise 
surprise, contains two hidden symmetries, Lorentz invariance and gauge invariance: two 
symmetries that, as we now know, literally hold the key to the secrets of the universe. Might 
not our present day theory also contain some unknown hidden symmetries, symmetries 
even more lovely than Lorentz and gauge invariance? I think that most physicists would 
say that the nineteenth-century greats missed these two crucial symmetries because of 
their lousy notation 2 and tendency to use equations of motion instead of the action. Some 
of these same people would doubt that we could significantly improve our notation and 
formalism, but the dotted-undotted notation looks clunky to me and I have a nagging 
feeling 3 that a more powerful formalism will one day replace the path integral formalism. 

Since the point of good pedagogy is to make things look easy, students sometimes do 
not fully appreciate that symmetries do not literally leap out at you. If someone had written 
a supersymmetric Yang-Mills theory in the mid-1950s, it would certainly have been a long 
time before people realized that it contained a hidden symmetry. So it is entirely possible 
that an insightful reader could find a hitherto unknown symmetry hidden in our well- 
studied field theories. 

It is not just a matter of clearer notation and formalism that caused the nineteenth- 
century greats to miss two important symmetries; it is also that they did not possess 
the mind set for symmetry. The old paradigm “experiments → action → symmetry” had 
to be replaced 4 in fundamental physics by the new paradigm “symmetry → action → 
experiments,” the new paradigm being typified by grand unified theory and later by string 
theory. Surely, some future physicists will remark archly that we of the early twenty-first 
century did not possess the right mind set. 

In physics textbooks, many subjects have a finished completed feel to them, but not 
quantum field theory. Some people say to me, what else is there to say about field theory? 
I would like to remind those people that a large portion of the material in this book was 
unknown 30 years ago. Of course, while I feel that further developments are possible, I 
have no idea what—otherwise I would have published it—so I can’t tell you what. But let me 
mention two recent developments that I find extremely intriguing. (1) Some field theories 
may be dual to string theories. (2) In dimensional deconstruction a d-dimensional field 
theory may look (d + 1)-dimensional in some range of the energy scale: the field theory 
can literally generate a spatial dimension. These developments suggest that quantum field 
theories contain considerable hidden structures waiting to be uncovered. Perhaps another 
golden age is in store for quantum field theory. 

2 It is said, and I agree, that one of Einstein’s great contributions is the repeated indices summation convention. 
Try to read Maxwell’s treatises and you will appreciate the importance of good notation. 

3 I once asked Feynman how he would solve the finite square well using the path integral. 

4 A. Zee, Fearful Symmetry, chap. 6. 

So boys and girls, the parade is over, and now it’s up to you to get another parade going. 5 


Appendix 


An image presented to the visual system can be described as a 2-dimensional Euclidean field $\varphi_0(x)$, with $\varphi_0$ 
representing the gray scale from black ($\varphi_0 = -\infty$) to white ($\varphi_0 = +\infty$). [You can see that color might be included 
by going to a field $\varphi$ transforming under some internal $SO(2)$ group for example.] The image actually perceived, 
$\varphi(x)$, is the actual image $\varphi_0(x)$ distorted to $\varphi_0[y(x)]$ plus some noise $\eta(x)$. Distortion is described by a map 
$x \to y(x)$ of the 2-dimensional Euclidean plane. Your brain's task is to decide whether the actual image is $\varphi_0(x)$ 
or some other $\varphi_1(x)$. Your ability to discriminate between images depends on the functional integral 

$$Z = \int Dy(x) \int D\eta(x)\, e^{-W[y(x)] - (1/2C)\int d^2x\, \eta(x)^2}\, \delta\{\varphi_0[y(x)] + \eta(x) - \varphi(x)\}$$ 

$$\phantom{Z} = \int Dy(x)\, e^{-W[y(x)] - (1/2C)\int d^2x\, \{\varphi(x) - \varphi_0[y(x)]\}^2} \qquad (1)$$ 

where for simplicity I have taken the noise, measured by the parameter $C$, to be Gaussian and white. The 
weighting function $W[y(x)]$ is presumably hard wired by evolution into our visual system, telling us that certain 
distortions (translations, rotations, and dilations) are much more likely than others. Writing $y(x) = x + A(x)$ we 
note that $Z$ defines a field theory of the 2-component field $A_i(x)$, which can always be written as $A_i = \partial_i \eta + \varepsilon_{ij}\partial_j \chi$. 
Note that the field $A_i(x)$ appears “inside” an “external” field $\varphi_0$. From symmetry considerations we might argue 
that 

$$W = -\int d^2x \left( \frac{1}{f^2}\, \eta\, \partial^6 \eta + \frac{1}{g^2}\, \chi\, \partial^6 \chi \right)$$ 

with two coupling constants $f$ and $g$. I can give here only the briefest of sketches and refer the interested reader 
to the literature. 6 Clearly, one can think of other examples. This particular example serves only to show that there 
are many more field theories than those described in standard texts. 


5 As the Beatles said, quantum fields forever! 

6 W. Bialek and A. Zee, Statistical mechanics and invariant perception, Phys. Rev. Lett. 58: 741, 1987; 
Understanding the efficiency of human perception, Phys. Rev. Lett. 61: 1512, 1988. 



Part N 


While quantum field theory was discovered and developed in the twentieth century, I will 
introduce in this part, added in the second edition of this text, some topics that have been 
worked out in the twenty-first century. At the rate these topics are rapidly evolving, I may 
be quite foolish to include them here. But I am taking the plunge as I think that I would 
serve my readers better by letting them have a taste of the twenty-first century rather 
than expanding on the twentieth. More likely than not, by the time this second edition 
is published, there will be better ways of treating the material contained here. You should 
read part N in this spirit and regard what is given here as an entry key to a fast growing 1 
research literature. 


1 Indeed, by the time the manuscript was copyedited (April 2009) it had been discovered that the amplitudes 
discussed in chapters N.2-4 could be written even more simply using a twistor and dual twistor formalism. See 
p. 494 and N. Arkani-Hamed, P. Cachazo, C. Cheung, and J. Kaplan, arXiv:0903.2110. 
Gravitational Waves and Effective Field Theory 


An unfinished symphony 

One astounding prediction of Einstein gravity is the existence of ripples crisscrossing the 
fabric of spacetime, what one writer refers to as Einstein’s unfinished symphony. 1 Massive 
detectors have been built, with more to come, in a human “curious George” effort to tune 
in to the song of the cosmos. 

Consider a black hole of size $r_s = 2Gm$ (its Schwarzschild radius; see chapter I.10, with 
$m$ its mass) a distance $r_0$ from another black hole, moving with velocity $v$. As the black 
holes spiral into each other they emit gravitational waves, with a characteristic wavelength 
determined by the orbital period $\lambda = 2\pi r_0/v$. Thus the physics contains three distance 
scales: $r_s$, $r_0$, and $\lambda$. We will stay within the simple post Newtonian regime $r_s \ll r_0 \ll \lambda$. 
Toward the end, as $r_0 \sim r_s$ and $v \sim 1$, relativistic effects rear their nasty heads, from which 
we will prudently stay away. 
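
To get a feel for how well separated the three scales are, here is a rough numerical sketch. It works in units with $r_s = 1$ and $c = 1$, and it assumes (this is not in the text) the Newtonian virial estimate $v^2 \sim Gm/r_0 = r_s/(2r_0)$ for the orbital speed; the function name `inspiral_scales` is purely illustrative.

```python
import math

def inspiral_scales(r0_over_rs):
    """Return (r_s, r_0, wavelength, v) in units where r_s = 1 and c = 1.

    Assumption (not from the text): the orbital speed is taken from the
    Newtonian virial estimate v^2 ~ G m / r_0 = r_s / (2 r_0).
    """
    r_s = 1.0
    r_0 = r0_over_rs * r_s
    v = math.sqrt(r_s / (2.0 * r_0))
    wavelength = 2.0 * math.pi * r_0 / v  # lambda = 2 pi r_0 / v
    return r_s, r_0, wavelength, v

for ratio in (10.0, 100.0, 1000.0):
    r_s, r_0, lam, v = inspiral_scales(ratio)
    print(f"r_0/r_s = {ratio:7.1f}   v = {v:.3f}   lambda/r_0 = {lam / r_0:8.1f}")
```

With this estimate the hierarchy $r_s \ll r_0 \ll \lambda$ holds whenever $v \ll 1$, and it degrades as $r_0$ approaches $r_s$, which is where the post Newtonian treatment is expected to fail.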

In the Closing Words to the first edition of this text, I mention that one intriguing 
development over the last few decades has been the use of effective field theory to describe 
situations involving more than one energy scale (or equivalently, length and time scales). 
The physics at the high energy scale M is then represented in the low energy effective 
Lagrangian by higher dimensional terms, suppressed by powers of M but constrained by 
the symmetries we know. Examples abound in this book, from the quantum Hall effect 
to surface growth to proton decay. The latter provides a classic example: while we profess 
ignorance of the physics responsible for proton decay, we can nevertheless make useful 
predictions by adding 4-fermion interactions invariant under the low energy gauge group 
$SU(3) \otimes SU(2) \otimes U(1)$, as shown in chapter VIII.3. 

An interesting recent development is an elegant description of the emission of gravitational 
waves by inspiraling black holes using effective field theory. Here, in contrast to 
proton decay, we actually know the short distance physics involved. Effective field theory 
nevertheless offers an efficient and sensible way to organize and compartmentalize physics 
on the various distance scales. We will merely touch upon one aspect of this approach. 

1 M. Bartusiak, Einstein's Unfinished Symphony. 


Finite size objects in general relativity 


Since $r_s \ll r_0$, the leading approximation would be to treat the black hole as a point particle 
using the action (I.11.12) $S_{pp} = -m\int d\tau = -m\int \sqrt{g_{\mu\nu}\, dX^\mu dX^\nu} = -m\int d\tau \sqrt{g_{\mu\nu}\dot X^\mu \dot X^\nu}$, 
where $\dot X^\mu = dX^\mu/d\tau$. Let us now include the corrections due to the finite size of the black 
hole. As you will see, the following discussion actually applies not only to black holes but 
to any finite sized object, including you. 

In the spirit of effective field theory, we add to $S_{pp}$ higher dimensional terms to be formed 
out of the point particle degree of freedom $X^\mu$ and the ambient $g_{\mu\nu}$ the particle moves in, 
subject to local coordinate invariance, of course. The invariant tensors we can form out of 
$g_{\mu\nu}$ are, to leading order, the scalar curvature $R$, the Ricci curvature $R_{\mu\nu}$, and the Riemann 
curvature tensor $R_{\mu\lambda\nu\rho}$. You might start with the scalar curvature and the Ricci curvature, 
and add to $S_{pp}$ the terms $S_{\rm drop} = \int d\tau\, (c_S R(X) + c_R R_{\mu\nu}(X)\dot X^\mu \dot X^\nu)$. The curvatures $R(X)$ 
and $R_{\mu\nu}(X)$ are evaluated on the worldline $X^\mu(\tau)$ of the particle of course. 

Einstein’s equation of motion $R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R = 0$ implies $R_{\mu\nu}(X) = 0$ and thus also 
$R(X) = 0$. Following the discussion in chapter VIII.3, we now show that, as we might 
intuitively feel, we are allowed to drop $S_{\rm drop}$. For our problem we have the total action 
$S = S_{EH} + S_{pp} + S_{\rm drop}$ with the Einstein-Hilbert action (VIII.1.1) $S_{EH} = \int d^4x \sqrt{-g}\, M_P^2 R$. 
Under a field redefinition $g_{\mu\nu} \to g_{\mu\nu} + \delta g_{\mu\nu}$, 

$$\delta S_{EH} = \int d^4x \sqrt{-g}\, M_P^2 \left(R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R\right) \delta g_{\mu\nu} \qquad (1)$$ 

Note that this was how we would have derived the equation of motion for gravity, by varying $g_{\mu\nu}$. 
Here we are not making an arbitrary variation, but rather our goal is to choose a specific 
$\delta g_{\mu\nu}$ so that the resulting $\delta S_{EH}$ negates $S_{\rm drop}$. Since $S_{\rm drop}$ consists of an integral over the 
worldline of the particle while $\delta S_{EH}$ is given by an integral over spacetime, we need a delta 
function in $\delta g_{\mu\nu}$ to switch from one kind of integral to another. The choice 

$$\delta g_{\mu\nu}(x) = \frac{1}{\sqrt{-g(x)}\, M_P^2} \int d\tau\, \delta^4(x - X(\tau)) \left[a\, g_{\mu\nu}(X) + b\, \dot X_\mu \dot X_\nu\right] \qquad (2)$$ 

gives 

$$\delta S = \int d\tau \left(-a R(X) + b \left[R_{\mu\nu}(X) - \tfrac{1}{2} g_{\mu\nu}(X) R(X)\right] \dot X^\mu \dot X^\nu\right) \qquad (3)$$ 

So, with some appropriate values of $a$ and $b$, we can indeed cancel off $S_{\rm drop}$, thus 
vindicating our intuition that the particle does not feel the Ricci and scalar curvatures 
for the obvious reason that they vanish. 2 
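
The step from the choice of $\delta g_{\mu\nu}$ to the worldline integral can be spelled out as follows. This is only a sketch: it assumes the worldline is parametrized by proper time, so that $g_{\mu\nu}\dot X^\mu \dot X^\nu = 1$, and it keeps only terms of first order in $\delta g_{\mu\nu}$; the explicit values of $a$ and $b$ in the last comment follow from those assumptions and are not quoted from the text.

```latex
\begin{align*}
\delta S
 &= \int d^4x \sqrt{-g}\, M_P^2 \left(R^{\mu\nu} - \tfrac12 g^{\mu\nu} R\right)
    \frac{1}{\sqrt{-g}\, M_P^2} \int d\tau\, \delta^4(x - X(\tau))
    \left[a\, g_{\mu\nu} + b\, \dot X_\mu \dot X_\nu\right] \\
 &= \int d\tau \left[a\,(R - 2R)
    + b \left(R_{\mu\nu} - \tfrac12 g_{\mu\nu} R\right) \dot X^\mu \dot X^\nu\right]
    \qquad (\text{using } g^{\mu\nu} g_{\mu\nu} = 4) \\
 &= \int d\tau \left[-\left(a + \tfrac{b}{2}\right) R
    + b\, R_{\mu\nu} \dot X^\mu \dot X^\nu\right]
    \qquad (\text{using } g_{\mu\nu} \dot X^\mu \dot X^\nu = 1).
\end{align*}
% Requiring \delta S + S_drop = 0, with
% S_drop = \int d\tau (c_S R + c_R R_{\mu\nu} \dot X^\mu \dot X^\nu), then gives
%   b = -c_R,   a = c_S + c_R / 2.
```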


2 A technicality: field redefinition also induces a contact interaction of the form $\int d\tau\, (1/\sqrt{-g})\, \delta^4(X_1(\tau) - X_2(\tau))$ 
between the two massive objects. Going from field theory to the point particle description represents a 
conceptual step backward, so that we should expect delta function effects at the location of the point particles. 

How about terms we can construct out of the Riemann curvature tensor $R_{\mu\lambda\nu\rho}$? For your 
convenience, I list its symmetry properties 3 here: $R_{\tau\rho\mu\nu} = -R_{\tau\rho\nu\mu} = -R_{\rho\tau\mu\nu}$, $R_{\tau\rho\mu\nu} = R_{\mu\nu\tau\rho}$, 
and $R_{\tau\rho\mu\nu} + R_{\tau\mu\nu\rho} + R_{\tau\nu\rho\mu} = 0$. Thus, due to the antisymmetry, we are not able to 
contract all four indices of $R_{\mu\lambda\nu\rho}$ with $\dot X^\mu$. We can contract at most two indices, to form 
the two objects $E_{\mu\nu}(X) = R_{\mu\lambda\nu\rho}(X)\dot X^\lambda \dot X^\rho$ and $B_{\mu\nu}(X) = \tilde R_{\mu\lambda\nu\rho}(X)\dot X^\lambda \dot X^\rho$, where 

$$\tilde R_{\mu\lambda\nu\rho}(x) = \tfrac{1}{2}\, \varepsilon_{\mu\lambda\sigma\eta}\, R^{\sigma\eta}{}_{\nu\rho}(x)$$ 

denotes the dual of the curvature tensor. These are 2-indexed tensors and we need to square 
them to form scalars to put into the action. Hence, to next order the particle action becomes 

$$S_p = \int d\tau \left(-m + c_E E_{\mu\nu} E^{\mu\nu} + c_B B_{\mu\nu} B^{\mu\nu} + \cdots\right) \qquad (4)$$ 

Note that the unknown constants $c_E$ and $c_B$ have dimension of inverse mass cubed. 

Before we explore the physical content of this effective action, let us understand the 
meaning of $E$ and $B$ by retreating to the more familiar case of a point particle moving 
in an electromagnetic field $F_{\mu\nu}$ (in flat space). Form $E_\mu = F_{\mu\nu}\dot X^\nu$ and $B_\mu = \tilde F_{\mu\nu}\dot X^\nu$, 
where $\tilde F_{\mu\nu}(x) = \frac{1}{2}\varepsilon_{\mu\nu\sigma\rho}F^{\sigma\rho}$. Going to the rest frame of the particle, where $\dot X^0 = 1$ and 
$\dot X^i = 0$, we see that, as the notation suggests, this is just the familiar decomposition of 
electromagnetism into electricity and magnetism. Similarly, $E_{\mu\nu}$ and $B_{\mu\nu}$ represent the 
decomposition of curvature into its “electric” and “magnetic” components. 

In chapter I.11 we varied the first term in (4) to obtain the standard geodesic equation 
that is at the heart of Einstein’s theory. Here we obtain 

$$\frac{d^2 X^\rho}{d\tau^2} + \Gamma^\rho_{\mu\nu}(X(\tau))\, \frac{dX^\mu}{d\tau} \frac{dX^\nu}{d\tau} = f^\rho(X(\tau)) \qquad (5)$$ 

where $f^\rho(X(\tau))$ comes from varying the $E$ and $B$ terms in (4). A finite sized body 
experiences a tidal force due to the varying gravitational force acting on it. It no longer 
follows a geodesic. 

The fact that we had to square the electric and magnetic components of the curvature 
to form the effective action (4) means that the effects of these correction terms are highly 
suppressed. Since Riemann curvature contains two derivatives, the correction terms involve 
four derivatives. To estimate the magnitude of $c_E$ and $c_B$ we exploit a rather cute 
argument as follows. 

Consider the scattering of a graviton off this point particle (which, remember, is a 
black hole in the problem we are studying) generated by the couplings in (4): $i\mathcal{M} \sim 
\cdots + i c_{E,B}\, \omega^4 / M_P^2 + \cdots$, where $\omega$ denotes the energy of the graviton. The powers of $\omega$ 
follow from the four derivatives just mentioned. (If you don’t understand the powers 
of $M_P$ you need to read chapter VIII.1 again.) Here $c_{E,B}$ denotes the two unknown 
couplings $c_E \sim c_B$ generically. Imagine calculating the total scattering cross section for 
a graviton on a black hole. Squaring the amplitude $\mathcal{M}$ etc., we would end up with $\sigma(\omega) \sim 
\cdots + c_{E,B}^2\, \omega^8 / M_P^4 + \cdots$. 


3 S. Weinberg, Gravitation and Cosmology, p. 141. 
This treatment of the black hole as a point particle is only valid for $\omega r_s \ll 1$ of course. The 
$(\cdots)$ in $i\mathcal{M}$ represents diagrams we have not included, for example, the one originating 
from the first term in (4) (namely the term responsible for keeping us down to earth!). A 
nice feature of the argument I am about to give is that we don’t even need to know what 
the terms in $(\cdots)$ are. 

On the other hand, we argue that by dimensional analysis the cross section must have 
the form $\sigma(\omega) = r_s^2 f(\omega r_s)$ since the only length scale in the Schwarzschild metric is $r_s$. 
Expanding the unknown function $f(\omega r_s)$ in powers of its argument we have $\sigma(\omega) = 
\cdots + a\, \omega^8 r_s^{10} + \cdots$ with $a$ some constant. (A technical aside: the massless graviton could 
produce infrared factors like $\log \omega r_s$, which we ignore for our purposes.) 

Requiring that the two expressions agree, we obtain $c_{E,B} \sim M_P^2 r_s^5$. Indeed, as expected, 
the couplings $c_{E,B}$ are highly suppressed as $r_s \to 0$. 
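
For the record, the matching can be laid out in three lines. This is just bookkeeping of the two expressions already given, with order-one constants such as $a$ dropped; the final comment is a consistency check on dimensions, not an extra result from the text.

```latex
\begin{align*}
\sigma(\omega) &\sim \cdots + \frac{c_{E,B}^2\, \omega^8}{M_P^4} + \cdots
  && \text{(from the effective action (4))} \\
\sigma(\omega) &= r_s^2\, f(\omega r_s) = \cdots + a\, \omega^8 r_s^{10} + \cdots
  && \text{(from dimensional analysis)} \\
\frac{c_{E,B}^2}{M_P^4} &\sim r_s^{10}
  \quad\Longrightarrow\quad c_{E,B} \sim M_P^2\, r_s^5 .
\end{align*}
% Dimension check: [M_P^2 r_s^5] = mass^2 * mass^{-5} = mass^{-3},
% in agreement with the dimension of c_E and c_B noted below (4).
```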


Exercise 

N.1.1 Using considerations similar to those in the text, show that the scattering cross section for a photon of 
frequency $\omega$ on an atom or a molecule vanishes like $\omega^4$ as $\omega \to 0$, a result which, as mentioned in chapter 
VIII.3, underlies the well-known explanation of why the sky is blue. 




Gluon Scattering in Pure Yang-Mills Theory 


Boil and toil with Feynman diagrams 

You might think that after some 50 years, there could not possibly be any novelty in 
calculating Feynman amplitudes. But you would be wrong. Over the last dozen years or so, 
and largely since the first edition of this text, a group of intrepid searchers have found some 
amazingly powerful methods of tackling Feynman diagrams. As I said at the beginning of 
chapters VIII.4 and 5, I can only give you an introduction to this subject, telling you just 
enough for you to explore this fast-growing literature. 

To best appreciate this new development, you should do a little calculation before reading 
further. Consider pure Yang-Mills theory, by consensus the nicest field theory we have, 
simple to write down and perfumed with symmetries. Not even any fermions around to 
mess things up. Call the gauge bosons gluons for convenience. Now calculate 5-gluon 
scattering at tree level as shown in figure N.2.1. No loops, just trees. The Feynman rules 
are given in chapter VII.1 and also in appendix C. 

You really must calculate before reading on. I will wait for you. You think to yourself, 
this is easy, just a bunch of tree diagrams. In fact, to make it easier, put all the external 
gluons on-shell, that is, set $p_i^2 = 0$, $i = 1, 2, \ldots, 5$. 

This calculation is not merely an idle exercise, but is in fact phenomenologically important. 
At an accelerator such as the soon-to-be-operational Large Hadron Collider, two 
protons are smashed together at high energies. Two gluons, one from each proton, collide 
and produce three gluons, which then materialize into three jets of hadrons. Because of 
asymptotic freedom, at high energies the effective coupling g becomes small enough for 
perturbative field theory to be relevant, and the tree amplitude you are busily calculating 
provides a key ingredient for the phenomenological models used to study the experimental 
measurements. 

Figure N.2.1 

Figure N.2.2 Result of a brute force calculation (actually only a small part of it). 

Time’s up! A small part of the answer is shown in figure N.2.2, taken from a lecture by 
Zvi Bern. 1 You really should take a look in order to appreciate, to be grateful for even, the 
formalism to be explained in this chapter. You know that the amplitude is linear in each 
of the five polarization vectors $\epsilon_i$. The 3-gluon vertex (VII.1.11) is linear in momentum 
and there are three of them in a typical diagram. Thus a typical term in the numerator 
of the Feynman amplitude would be, as shown in the figure, $p_1 \cdot p_4\; \epsilon_2 \cdot p_1\; \epsilon_1 \cdot \epsilon_3\; \epsilon_4 \cdot \epsilon_5$. A 
rough estimate shows that there are almost 10,000 such terms. That’s why, in spite of my 
admonition, you didn’t finish the calculation before reading ahead. Incidentally, you could 
see that even 4-gluon scattering at tree level, though doable by hand, is rather involved. 


1 Z. Bern, “Magic Tricks for Scattering Amplitudes,” http://online.itp.ucsb.edu/online/colloq/bernl/pdf/Bernl.pdf. 
New technology for Feynman diagrams 


In practice, phenomenologists studying jet production have developed elaborate computer 
codes based on numerical recursion and these prove to be quite efficient. In this introductory 
text, however, we are not after numerical efficiency but a deeper understanding of the 
structure of multi-gluon amplitudes. I have set you up so that, surely, after your abortive 
attempt to calculate the 5-gluon amplitude you now fully appreciate the need for new ways 
of approaching Feynman diagrams. I will now explain some of the novel methods people 
have invented over the last 15 years or so. 

A relatively simple first step is to strip the color off the amplitude. Evidently, it is much 
better to use the matrix notation of (IV.5.16) than the index notation of (IV.5.17). Instead 
of the structure constants $f^{abc}$ and their products in the “indexed” Feynman rules in 
(VII.1.11-13) we have colored Feynman rules (read appendix 1 now) with objects like 
$\mathrm{tr}(T^a [T^b, T^c])$ and $\mathrm{tr}([T^a, T^b][T^c, T^d])$ for the cubic and quartic coupling vertex, respectively, 
where $T^a$ denotes the matrix representing the suitably normalized generators of the 
gauge group. Denote the color matrices carried by the external gluons by $T^{a_1}, T^{a_2}, \ldots, T^{a_n}$ 
(with $n = 5$ in the example you failed to do). 

In calculating a multi-gluon scattering amplitude in tree approximation, you would find 
each term multiplied by the product of a bunch of color traces, such as $\mathrm{tr}(T^e A)\, \mathrm{tr}(T^e B)$ 
where $A$ and $B$ denote products of $T$’s. Here the index $e$ is carried by a virtual gluon and 
hence is summed over. We now use the group theoretic identity (with $e$ summed over) for 
the gauge group $SU(N)$ [recall (IV.5.19)] 

$$(T^e)^i_j\, (T^e)^k_l = \frac{1}{2} \left(\delta^i_l \delta^k_j - \frac{1}{N}\, \delta^i_j \delta^k_l\right) \qquad (1)$$ 

(Here $e = 1, \ldots, N^2 - 1$ and the indices $i, j, k, l = 1, \ldots, N$, of course.) The second term 
takes care of the traceless condition $\mathrm{tr}\, T^e = 0$. However, we can drop it, since if we extend 
the gauge group to $U(N)$, that extra gluon does not couple to the other gluons anyway. Thus 
$\mathrm{tr}(T^e A)\, \mathrm{tr}(T^e B) = \frac{1}{2} \mathrm{tr}(AB)$. Repeating this procedure, we reduce the product of traces to 
a single trace of $n$ $T^a$’s multiplied together in some specific order. 
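
If you want to see identity (1) at work, the smallest case $N = 2$ can be checked in a few lines. The sketch below assumes the normalization $T^e = \sigma^e/2$, so that $\mathrm{tr}(T^a T^b) = \frac{1}{2}\delta^{ab}$; the helper names are illustrative only. It also checks the trace reduction with the $1/N$ term kept, which for a traceless $A$ or $B$ collapses to $\frac{1}{2}\mathrm{tr}(AB)$.

```python
# Generators T^e = sigma^e / 2 of SU(2), normalized so that tr(T^a T^b) = delta^{ab} / 2.
# T[e][i][j] stands for (T^e)^i_j.
N = 2
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def delta(a, b):
    return 1.0 if a == b else 0.0

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def trace(A):
    return sum(A[i][i] for i in range(N))

def completeness_max_error():
    """Largest deviation of sum_e (T^e)^i_j (T^e)^k_l from the right-hand side of (1)."""
    worst = 0.0
    for i in range(N):
        for j in range(N):
            for k in range(N):
                for l in range(N):
                    lhs = sum(Te[i][j] * Te[k][l] for Te in T)
                    rhs = 0.5 * (delta(i, l) * delta(k, j)
                                 - delta(i, j) * delta(k, l) / N)
                    worst = max(worst, abs(lhs - rhs))
    return worst

def trace_reduction_error(A, B):
    """Deviation of sum_e tr(T^e A) tr(T^e B) from tr(AB)/2 - tr(A) tr(B)/(2N)."""
    lhs = sum(trace(matmul(Te, A)) * trace(matmul(Te, B)) for Te in T)
    rhs = 0.5 * trace(matmul(A, B)) - trace(A) * trace(B) / (2 * N)
    return abs(lhs - rhs)

A = [[1, 2j], [3, -1]]       # an arbitrary traceless test matrix
B = [[0.5, 1], [-2j, 4]]     # an arbitrary test matrix
print(completeness_max_error(), trace_reduction_error(A, B))
```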

Indeed, the astute reader will have noted that had we used the double-line formalism of 
figure IV.5.2, this entire discussion would not even have been necessary. As we also saw 
in chapter VII.4, the double-line formalism does offer many advantages. 

The other simplifying step is to specify the helicity of the gluon instead of writing 
the amplitude in terms of polarization vectors. You recall, from way back when, that a 
massless spin 1 particle moving along the third direction $k = \omega(1, 0, 0, 1)$ can have helicity 
$h = +$, corresponding to the polarization vector $\epsilon = (1/\sqrt{2})(0, 1, i, 0)$, or helicity $h = -$, 
corresponding to the polarization vector $\epsilon = (1/\sqrt{2})(0, 1, -i, 0)$. We specify the external 
gluons by momentum, helicity, and color: $(p_1, h_1, a_1, p_2, h_2, a_2, \ldots, p_n, h_n, a_n)$. 

Thus we can write the $n$-gluon amplitude as 

$$\mathcal{M} = i \sum_{\rm permutations} \mathrm{tr}(T^{a_1} T^{a_2} \cdots T^{a_n})\, A(1, 2, \ldots, n) \qquad (2)$$ 

Following the literature, we have compressed the notation further and denote $\{p_i, h_i\}$ by $i$. 
The sum is over all possible permutations of the $n$ gluons. We can now focus on the “color 
stripped” amplitude $A(1, 2, \ldots, n)$. 

First, a triviality. It is convenient to treat the gluons as all outgoing (or if you prefer, as 
all incoming), so that $\sum_i p_i = 0$ and the time component of some of the momenta can be 
negative. We can then obtain the physically desired amplitude by crossing. Keep in mind 
that under crossing $p \to -p$ and $\epsilon \to \epsilon^*$, that is, the helicity flips. 


The spinor helicity formalism 


Now we are ready to return to the expression in figure N.2.2. The technical term for this 
expression is an “unholy mess.” It turns out that the key to unraveling this hopeless morass 
can be found in exercise II.3.1: that the Lorentz vector sits in the representation $(\frac{1}{2}, \frac{1}{2})$ and 
thus can be constructed as a product of two spinors, one from the representation $(\frac{1}{2}, 0)$, 
the other from $(0, \frac{1}{2})$. You did the exercise, didn’t you? So you know how to write, for 
example, the momentum vector $p^\mu$ as a product of two spinors. To go on, you should also 
read appendices B and E. 

I am now ready to explain the spinor helicity formalism designed to exploit this peculiar 
property of the Lorentz vector. Or, to say it a bit more mysteriously, I am going to show 
you how to take the square root of the momentum. 

Now you appreciate the power of the undotted-dotted notation introduced in appendix E. 
The undotted index goes with $(\frac{1}{2}, 0)$, and the dotted with $(0, \frac{1}{2})$. We are looking for an object 
transforming like $(\frac{1}{2}, 0) \otimes (0, \frac{1}{2})$ to represent a vector. The problem can then be stated as 
follows: instead of writing momentum as $p_\mu$ we want to write it as $p_{\alpha\dot\alpha}$, an object carrying 
an undotted and a dotted index, namely a 2 by 2 matrix in cruder language. 

We merely have to flip through appendix E and look for an object carrying the desired 
indices. There it is, $(\sigma^\mu)_{\alpha\dot\alpha}$, and indeed, its $\mu$ index is begging to be contracted with $p_\mu$. 

Thus, with no further work, we can write [since $\sigma^\mu = (I, \vec\sigma)$] 

$$p_{\alpha\dot\alpha} \equiv p_\mu (\sigma^\mu)_{\alpha\dot\alpha} = (p^0 I - \vec p \cdot \vec\sigma)_{\alpha\dot\alpha} = \begin{pmatrix} p^0 - p^3 & -(p^1 - i p^2) \\ -(p^1 + i p^2) & p^0 + p^3 \end{pmatrix} \qquad (3)$$ 

We have succeeded in writing the momentum as a 2 by 2 matrix. You may recognize this 
as nothing but the matrix $X_M$ (with some trivial change in notation) used in appendix B 
to construct the covering of $SO(3, 1)$ by $SL(2, C)$. 

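Given two vectors $p$ and $q$, their scalar product is given by 

$$p \cdot q = \tfrac{1}{2}\, \varepsilon^{\alpha\beta} \varepsilon^{\dot\alpha\dot\beta}\, p_{\alpha\dot\alpha}\, q_{\beta\dot\beta} \qquad (4)$$ 

which you can check explicitly, writing the right-hand side as a trace and once again using 
$\sigma_2 \sigma_i^T = -\sigma_i \sigma_2$, as we did in appendix E. For $q = p$, this reduces to $p \cdot p = \frac{1}{2} \varepsilon^{\alpha\beta} \varepsilon^{\dot\alpha\dot\beta} p_{\alpha\dot\alpha} p_{\beta\dot\beta} = 
\det p$; here we recognized a definition of the determinant. [Of course, you could also evaluate 
the determinant of (3) by inspection, or recall that this was also used in appendix B.] 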
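
The explicit check is quick to do by machine as well. The following sketch builds the matrix of (3) for two arbitrary real 4-vectors and compares the double contraction with $\varepsilon$ against the Minkowski dot product (metric signature $+,-,-,-$); with the matrix normalized as in (3), the contraction carries an overall factor of $\frac{1}{2}$. The function names are illustrative, and the sign convention chosen for $\varepsilon$ drops out because it appears twice.

```python
def to_matrix(p):
    """Map p = (p0, p1, p2, p3) to the 2 by 2 matrix of equation (3)."""
    p0, p1, p2, p3 = p
    return [[p0 - p3, -(p1 - 1j * p2)],
            [-(p1 + 1j * p2), p0 + p3]]

def minkowski_dot(p, q):
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

EPS = [[0, 1], [-1, 0]]  # antisymmetric symbol; its overall sign cancels below

def spinor_dot(p, q):
    """(1/2) eps^{ab} eps^{a'b'} p_{a a'} q_{b b'}, with primes standing for dotted indices."""
    P, Q = to_matrix(p), to_matrix(q)
    total = 0
    for a in range(2):
        for b in range(2):
            for ad in range(2):
                for bd in range(2):
                    total += EPS[a][b] * EPS[ad][bd] * P[a][ad] * Q[b][bd]
    return 0.5 * total

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

p = (3.0, 1.0, -2.0, 0.5)
q = (2.0, 0.3, 0.7, -1.1)
print(spinor_dot(p, q), minkowski_dot(p, q))
print(det2(to_matrix(p)), minkowski_dot(p, p))
```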
Clearly, there is an unavoidable notational overload: the single letter p denotes both the 
vector and the matrix, but you should be able to tell from the context which object is being 
referred to. 

Here we are going to apply this formalism to massless gluons with lightlike momenta. 
Things simplify considerably: for $p$ lightlike, $\det p = 0$ and thus the matrix $p$ generically 
has one 0 eigenvalue. (In fancy talk, the matrix has rank 1 rather than 2.) From elementary 
linear algebra we recall that a 2 by 2 matrix $m$ of rank 1 can always be written as $m_{ij} = v_i w_j$, 
with $v$ and $w$ two 2-component vectors (obviously, since the vector orthogonal to $w$ provides 
the 0 eigenvector). Thus, for a lightlike vector, we can write 

$$p_{\alpha\dot\alpha} = \lambda_\alpha \tilde\lambda_{\dot\alpha} \qquad (5)$$ 

in terms of two 2-component spinors $\lambda$ and $\tilde\lambda$. 

For physical momentum, the components $p^\mu$ are real, of course. I invite you to verify, 
however, that everything we just did from (3) to (5) goes through even if $p^\mu$ are complex. 
It turns out that in the next chapters we will find it convenient to consider complex 
momentum. 

Upon first exposure, the formalism appears quite opaque, but actually, like a lot of 
formalisms, it is fairly simple or perhaps even trivial. If you are confused at any point in 
the following exposition, just work things out explicitly. For example, consider a physical 
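momentum with $p^0 = E > 0$. With no loss of generality, you can choose $\vec p$ to point along 
the third direction, so that (with a trivial abuse of notation $p = |\vec p|$) 

$$p_{\alpha\dot\alpha} = \begin{pmatrix} E - p & 0 \\ 0 & E + p \end{pmatrix}$$ 

which for $p$ lightlike collapses to the rank 1 matrix 

$$p = 2E \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = 2E \begin{pmatrix} 0 \\ 1 \end{pmatrix} \begin{pmatrix} 0 & 1 \end{pmatrix}$$ 

Thus, in this case, $\lambda$ and $\tilde\lambda$ are both equal to 

$$\sqrt{2E} \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$ 

numerically. (To make sure you get it, work this out for $\vec p$ pointing in some other direction.) 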
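
For a momentum pointing in a general direction, one possible choice of spinor can be sketched as follows. The code assumes a real lightlike $p$ with $p^0 + p^3 > 0$ and takes $\tilde\lambda = \lambda^*$; fixing the lower component to be real and positive is an arbitrary phase choice made here for illustration, and the function names are hypothetical.

```python
import math

def to_matrix(p):
    """Map p = (p0, p1, p2, p3) to the 2 by 2 matrix of equation (3)."""
    p0, p1, p2, p3 = p
    return [[p0 - p3, -(p1 - 1j * p2)],
            [-(p1 + 1j * p2), p0 + p3]]

def spinor_from_momentum(p):
    """Return lambda = (lambda_1, lambda_2) with lambda_a * conj(lambda_b) = p_{a b-dot}.

    Assumes p is real and lightlike with p0 + p3 > 0.
    The overall phase is arbitrary; lambda_2 is chosen real and positive.
    """
    p0, p1, p2, p3 = p
    root = math.sqrt(p0 + p3)
    return (-(p1 - 1j * p2) / root, complex(root))

def outer(lam):
    """The rank 1 matrix lambda_a * conj(lambda_b)."""
    return [[lam[a] * lam[b].conjugate() for b in range(2)] for a in range(2)]

p = (3.0, 1.0, 2.0, 2.0)   # lightlike: 3^2 = 1^2 + 2^2 + 2^2
lam = spinor_from_momentum(p)
print(outer(lam))
print(to_matrix(p))
```

For $\vec p$ along the third direction the upper component vanishes and the lower one is $\sqrt{2E}$, in agreement with the example above.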

You can think of the Pauli spinors $\lambda$ and $\tilde\lambda$ as the “square root” of the Lorentz vector $p^\mu$. 
Note how the group theory discussion in chapter II.3 foreordained this rather nontrivial 
possibility. After all, there we saw how a Lorentz vector can be constructed out of two Dirac 
spinors u and u'. 

Interestingly, in discussing ferromagnets and antiferromagnets in chapter VI.5, we used 
a poor man’s version of (3), namely $\vec n = z^\dagger \vec\sigma z$. 

You learned in school that the ordinary square root has a sign ambiguity. Analogously, 
in (5) $p$ does not determine $\lambda$ and $\tilde\lambda$ uniquely. We can always rescale $\lambda \to u\lambda$ and $\tilde\lambda \to \frac{1}{u}\tilde\lambda$ 
for any nonzero complex number $u$. (You might have wondered what fixed the overall constant in 
$\lambda$ and $\tilde\lambda$ in the simple example above: I made an arbitrary choice.) 
For real momentum, the matrix $p_{\alpha\dot\alpha} = p_\mu (\sigma^\mu)_{\alpha\dot\alpha}$ is hermitean, which implies that $\tilde\lambda = \lambda^*$ 
is the complex conjugate of $\lambda$. The spinor $\tilde\lambda$ is not independent of $\lambda$, and so the rescaling 
parameter $u$ is restricted to be a phase factor $e^{i\gamma}$. [Also, recall from appendix B how $X_M$ 
transforms under $SL(2, C)$ and you will see that it is all consistent.] In this case, the 
condition that $p$ has rank 1 allows for two solutions: $p_{\alpha\dot\alpha} = \pm\lambda_\alpha \tilde\lambda_{\dot\alpha}$, with the two possible 
signs corresponding to whether $p^0 > 0$ or not. 

A side remark at this point: We will see that it is useful to consider the group $SO(2, 2)$ 
instead of the Lorentz group $SO(3, 1)$. Thus, as the discussion in appendix B indicates, 
you can also take the square root of an $SO(2, 2)$ vector and write $p_{\alpha\dot\alpha} = \lambda_\alpha \tilde\lambda_{\dot\alpha}$, but with $\lambda$ 
and $\tilde\lambda$ two independent real spinors, as is consistent with the local isomorphism between 
$SO(2, 2)$ and $SL(2, R) \otimes SL(2, R)$. The rescaling mentioned above is now restricted to $u$ 
being a real number. 

It is instructive to count the number of real degrees of freedom for these different 
cases. A complex lightlike momentum depends on $4 \times 2 - 2 = 6$ real numbers, since the 
condition $p^2 = 0$ now amounts to two real conditions, while $\lambda$ and $\tilde\lambda$ each contains 2 complex 
numbers, but with rescaling we are left with $2 \times 2 - 1 = 3$ complex numbers, that is, 6 
real numbers. A real lightlike momentum depends on $4 - 1 = 3$ real numbers, but now 
$\tilde\lambda$ is tied to $\lambda$ containing 2 complex numbers, which get reduced to 3 real numbers after 
rescaling by a phase factor. For a (real) lightlike vector transforming under $SO(2, 2)$, we 
have 2 real spinors, which after rescaling contain 3 real numbers. So it all works out, of 
course. 

I mention all this here for future use. It should be evident to you, for the rest of 
this chapter, which statements hold for complex momenta and which hold only for real 
momenta. At the end of the day, when we arrive at a physical quantity, such as the 
amplitude, we will of course set the momenta contained therein to be real. 

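For two lightlike vectors $p$ and $q$, write $p_{\alpha\dot\alpha} = \lambda_\alpha\tilde\lambda_{\dot\alpha}$ and $q_{\alpha\dot\alpha} = \mu_\alpha\tilde\mu_{\dot\alpha}$; then we have 

$$p \cdot q = (\varepsilon^{\alpha\beta}\lambda_\alpha\mu_\beta)(\varepsilon^{\dot\alpha\dot\beta}\tilde\lambda_{\dot\alpha}\tilde\mu_{\dot\beta}) = \langle\lambda, \mu\rangle[\tilde\lambda, \tilde\mu] \tag{6}$$

Here we have defined the two Lorentz invariants 

$$\langle\lambda, \mu\rangle = \varepsilon^{\alpha\beta}\lambda_\alpha\mu_\beta = -\langle\mu, \lambda\rangle \tag{7}$$

and 

$$[\tilde\lambda, \tilde\mu] = \varepsilon^{\dot\alpha\dot\beta}\tilde\lambda_{\dot\alpha}\tilde\mu_{\dot\beta} = -[\tilde\mu, \tilde\lambda] \tag{8}$$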

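(treating the spinors as c-number objects). Note in passing that with our convention, 
$\lambda^1 = \lambda_2$ and $\lambda^2 = -\lambda_1$, and so $\langle\lambda, \mu\rangle = -\lambda^1\mu^2 + \lambda^2\mu^1 = -\varepsilon_{\alpha\beta}\lambda^\alpha\mu^\beta$. 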

We have already verified in (E.13) that $\langle\lambda, \mu\rangle$ is invariant, but for the sake of total 
pedagogical clarity let us check it once more, this time using infinitesimal transformations. 
Write (E.4) more compactly as $\delta\lambda_\alpha = \sigma_\alpha{}^\beta\lambda_\beta$, where $\sigma$ denotes some linear combination 
of Pauli matrices. Noting that $\langle\lambda, \mu\rangle$ is nothing but $\lambda\sigma_2\mu$ up to some irrelevant overall 
constant, we have indeed $\delta(\lambda\sigma_2\mu) = (\lambda\sigma^T\sigma_2\mu + \lambda\sigma_2\sigma\mu) = 0$. 

A notational remark: the twiddles in $[\tilde\lambda, \tilde\mu]$ are redundant. The square bracket is defined 
only for spinors transforming like $(0, \frac12)$. Henceforth, we will write $[\lambda, \mu] = \varepsilon^{\dot\alpha\dot\beta}\tilde\lambda_{\dot\alpha}\tilde\mu_{\dot\beta}$. 

For real physical momenta, $\tilde\lambda = \lambda^*$ so that $\langle\lambda, \mu\rangle = [\lambda, \mu]^*$. Then $p \cdot q = \langle\lambda, \mu\rangle[\lambda, \mu]$ 
implies that $\langle\lambda, \mu\rangle = \sqrt{p \cdot q}\, e^{i\phi}$ and $[\lambda, \mu] = \sqrt{p \cdot q}\, e^{-i\phi}$, with some phase factor $e^{i\phi}$. We 
thus conclude that the two spinorial products may be regarded as the (two) square roots 
of the Lorentz dot product $p \cdot q$ up to a phase factor. 
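The statements around (6) to (8) can be checked numerically. The following is a minimal sketch, not the author's construction: it assumes the matrix convention $p_{\alpha\dot\alpha} = \begin{pmatrix} p^0 + p^3 & p^1 - ip^2 \\ p^1 + ip^2 & p^0 - p^3\end{pmatrix}$ and the sign $\varepsilon^{12} = +1$, both of which are choices made here for illustration. With this normalization the spinor product comes out as $2\,p \cdot q$; the factor of 2 is a normalization that (6) does not display. The helper names (`bracket`, `bispinor`, `four_vector`) are hypothetical.

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1; the overall sign of epsilon is an assumed convention.
    return a[0] * b[1] - a[1] * b[0]

def bispinor(left, right):
    # Rank-1 matrix p_{alpha alphadot} = left_alpha * right_alphadot.
    return [[left[i] * right[j] for j in range(2)] for i in range(2)]

def four_vector(P):
    # Invert P = [[p0 + p3, p1 - i p2], [p1 + i p2, p0 - p3]] (assumed sigma-matrix convention).
    p0 = (P[0][0] + P[1][1]) / 2
    p3 = (P[0][0] - P[1][1]) / 2
    p1 = (P[0][1] + P[1][0]) / 2
    p2 = (P[1][0] - P[0][1]) / 2j
    return [p0, p1, p2, p3]

def minkowski(p, q):
    # Signature (+, -, -, -), with no complex conjugation.
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

rng = random.Random(0)
lam, mu = random_spinor(rng), random_spinor(rng)
lam_t = [x.conjugate() for x in lam]   # real momentum: lambda-tilde is the conjugate of lambda
mu_t = [x.conjugate() for x in mu]

p = four_vector(bispinor(lam, lam_t))
q = four_vector(bispinor(mu, mu_t))

p_squared = minkowski(p, p)                               # lightlike by construction
spinor_product = bracket(lam, mu) * bracket(lam_t, mu_t)  # <lambda, mu>[lambda, mu]
two_p_dot_q = 2 * minkowski(p, q)
modulus_squared = abs(bracket(lam, mu)) ** 2              # |<lambda, mu>|^2

print("p^2:", abs(p_squared))
print("<l,m>[l,m] - 2 p.q:", abs(spinor_product - two_p_dot_q))
print("|<l,m>|^2 - 2 p.q:", abs(modulus_squared - two_p_dot_q))
```

For real momenta the angle bracket has modulus $\sqrt{2\,p \cdot q}$ in this normalization, which is the "square root up to a phase" statement in numerical form.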

You could now raise an interesting question: how do we write the polarization vectors 
$\epsilon(p)$ of a massless gluon? 

The requirement that $\epsilon(p) \cdot p = 0$ can be satisfied, according to (4), by setting $\epsilon_{\alpha\dot\alpha} = 
d^{-1}\lambda_\alpha\tilde\mu_{\dot\alpha}$, for an arbitrary $\tilde\mu$ and with the factor $d$ determined as follows. We require that, 
for an arbitrary complex number $w$, scaling $\tilde\mu \to w\tilde\mu$ does not change $\epsilon$ (since $\tilde\mu$ is arbitrary 
after all). Thus $d$ has to be linear in $\tilde\mu$. The further requirement that $d$ be Lorentz invariant 
implies, as we just learned, that $d = [\chi, \tilde\mu]$, where $\chi$ is some $(0, \frac12)$ spinor. The only spinor 
available is $\tilde\lambda$ and hence we obtain 

$$\epsilon^-_{\alpha\dot\alpha} = \frac{\lambda_\alpha\tilde\mu_{\dot\alpha}}{[\tilde\lambda, \tilde\mu]} \tag{9}$$

By convention, we will call this polarization negative helicity. 

The arbitrary choice of $\tilde\mu$ represents the freedom inherent in a gauge theory. Indeed, 
we see that gauge transformation corresponds to the spinorial shift $\tilde\mu \to \tilde\mu + \gamma\tilde\lambda$ (for some 
arbitrary number $\gamma$) under which $\epsilon_{\alpha\dot\alpha} \to \epsilon_{\alpha\dot\alpha} + \gamma\lambda_\alpha\tilde\lambda_{\dot\alpha}$, which translates into the usual shift 
of $\epsilon$ by some multiple of $p$. 

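The positive helicity polarization is given by the other possible choice 

$$\epsilon^+_{\alpha\dot\alpha} = \frac{\mu_\alpha\tilde\lambda_{\dot\alpha}}{\langle\mu, \lambda\rangle} \tag{10}$$

Check that it works. Gauge transformation now corresponds to the shift $\mu \to \mu + \gamma\lambda$. Note 
that the polarization vectors are normalized as $\epsilon^+ \cdot \epsilon^- = \langle\mu\lambda\rangle[\mu\lambda]/(\langle\mu\lambda\rangle[\mu\lambda]) = 1$. 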
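The properties claimed for (9) and (10) can be verified with a few lines of arithmetic. This is a sketch under stated assumptions: the dot product of two bispinors is taken to be the double epsilon contraction of (6) (so any factor of 2 relative to the Minkowski product is ignored), the sign of epsilon is chosen as $\varepsilon^{12} = +1$, and the helper names are hypothetical. The momentum is allowed to be complex, so $\lambda$ and $\tilde\lambda$ are independent.

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def bispinor(left, right):
    # Rank-1 matrix left_alpha * right_alphadot.
    return [[left[i] * right[j] for j in range(2)] for i in range(2)]

def scale(P, c):
    return [[c * x for x in row] for row in P]

def contract(P, Q):
    # eps^{ab} eps^{adot bdot} P_{a adot} Q_{b bdot}; for rank-1 inputs this is <..>[..] as in (6).
    return P[0][0] * Q[1][1] - P[0][1] * Q[1][0] - P[1][0] * Q[0][1] + P[1][1] * Q[0][0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

rng = random.Random(2)
lam, lam_t = random_spinor(rng), random_spinor(rng)   # momentum spinors
mu, mu_t = random_spinor(rng), random_spinor(rng)     # arbitrary reference spinors

p = bispinor(lam, lam_t)
eps_minus = scale(bispinor(lam, mu_t), 1 / bracket(lam_t, mu_t))   # equation (9)
eps_plus = scale(bispinor(mu, lam_t), 1 / bracket(mu, lam))        # equation (10)

transverse_minus = contract(eps_minus, p)       # expected to vanish
transverse_plus = contract(eps_plus, p)         # expected to vanish
normalization = contract(eps_plus, eps_minus)   # expected to equal 1

# Gauge shift mu_t -> mu_t + gamma * lam_t changes eps_minus by a multiple of p.
gamma = 0.7 - 0.3j
mu_t_shifted = [mu_t[i] + gamma * lam_t[i] for i in range(2)]
eps_minus_shifted = scale(bispinor(lam, mu_t_shifted), 1 / bracket(lam_t, mu_t_shifted))
multiple_of_p = scale(p, gamma / bracket(lam_t, mu_t))
shift_error = max(
    abs(eps_minus_shifted[i][j] - eps_minus[i][j] - multiple_of_p[i][j])
    for i in range(2) for j in range(2)
)

print(abs(transverse_minus), abs(transverse_plus), abs(normalization - 1), shift_error)
```

In this sketch the multiple of $p$ produced by the shift is $\gamma/[\tilde\lambda, \tilde\mu]$, because the denominator of (9) is itself unchanged by the shift.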


Taming the unholy mess 

Consider the tree-level scattering amplitude with $n \ge 4$ outgoing massless gluons. (In this 
and the next sections, we can take all momenta to be real.) The color-stripped amplitude 
is then characterized by a string of helicities $(h_1, \cdots, h_n)$. Take for example the amplitude 
with $(+ + + \cdots + +)$. Upon crossing, it describes two gluons, each with helicity $-$, going 
into $n - 2$ gluons all with helicity $+$. Both incoming gluons flip their helicity and thus 
this amplitude is said to be maximal helicity violating. Your intuition may tell you that 
this amplitude ought to be suppressed, since highly energetic massless particles tend to 
maintain their helicities. If you try to verify this using traditional Feynman diagrams, you 
would once again encounter a big mess. 

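The spinor helicity formalism rides to the rescue. Consider the amplitude $A(h_1, \cdots, h_n)$. 
For each of the $n$ gluons, we have $p_{i\alpha\dot\alpha} = \lambda_{i\alpha}\tilde\lambda_{i\dot\alpha}$, and an arbitrary spinor that we are 
free to choose (subject to some conditions), namely either $\mu_{i\alpha}$ or $\tilde\mu_{i\dot\alpha}$, depending on 
whether the corresponding helicity is $+$ or $-$, respectively. There are quite a few indices, 
but fortunately, in computing amplitudes, we encounter only Lorentz invariants, such as 
$\epsilon_i \cdot \epsilon_j = \varepsilon^{\alpha\beta}\varepsilon^{\dot\alpha\dot\beta}\epsilon_{i\alpha\dot\alpha}\epsilon_{j\beta\dot\beta}$ (be sure to distinguish between the two varieties of epsilon here!), 
and thus the spinor indices will be contracted over and disappear. In particular, we have 
(omitting the comma in the angled and square brackets) 

$$\epsilon_i^+ \cdot \epsilon_j^+ = \frac{\langle\mu_i\mu_j\rangle[\lambda_i\lambda_j]}{\langle\mu_i\lambda_i\rangle\langle\mu_j\lambda_j\rangle} \tag{11}$$

$$\epsilon_i^- \cdot \epsilon_j^- = \frac{\langle\lambda_i\lambda_j\rangle[\mu_i\mu_j]}{[\lambda_i\mu_i][\lambda_j\mu_j]} \tag{12}$$

$$\epsilon_i^- \cdot \epsilon_j^+ = \frac{\langle\lambda_i\mu_j\rangle[\mu_i\lambda_j]}{[\lambda_i\mu_i]\langle\mu_j\lambda_j\rangle} \tag{13}$$

We also list for convenience 

$$\epsilon_i^+ \cdot p_j = \frac{\langle\mu_i\lambda_j\rangle[\lambda_i\lambda_j]}{\langle\mu_i\lambda_i\rangle} \tag{14}$$

and 

$$\epsilon_i^- \cdot p_j = \frac{\langle\lambda_i\lambda_j\rangle[\mu_i\lambda_j]}{[\lambda_i\mu_i]} \tag{15}$$

Evidently, in this formalism, flipping helicity corresponds to interchanging the brackets 
$\langle\cdots\rangle$ and $[\cdots]$. 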

We need one more important observation. Obviously, in a tree-level diagram for $n$-gluon 
scattering, you cannot have as many 3-gluon vertices as you like. Draw the tree diagrams 
for $n = 4$ for example (see figure N.2.3). The number of 3-gluon vertices could be either 
0 or 2. In general, the number of 3-gluon vertices can be at most $n - 2$. You are asked to 
verify this in exercise N.2.2. As remarked earlier, while the 4-gluon vertex does not involve 
momentum, the 3-gluon vertex is linear in momentum. Thus, in the numerator of the 
Feynman amplitude, we have $n$ polarization vectors $\epsilon$, but at most $n - 2$ momenta. We are 
to form a scalar out of these Lorentz vectors by taking dot products. Clearly, there are at 
least two polarization vectors who have to dance with each other. Therefore we conclude 
that the tree amplitude must contain at least one power of $\epsilon_i \cdot \epsilon_j$. (In the $n = 5$ case that 
power was actually 2, as we saw.) 
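The counting argument can be made concrete with a short enumeration. This sketch relies on the relation $V_3 + 2V_4 = n - 2$ between the numbers of cubic and quartic vertices in a tree diagram, which appendix 1 quotes from exercise N.2.2; the function names are hypothetical and the code only tabulates the allowed vertex counts, it does not draw diagrams.

```python
def vertex_counts(n):
    # All (V3, V4) with V3 + 2 * V4 = n - 2 and V3, V4 >= 0, for tree-level n-gluon scattering.
    return [(n - 2 - 2 * v4, v4) for v4 in range((n - 2) // 2 + 1)]

def minimum_polarization_pairs(n):
    # n polarization vectors, at most max(V3) momenta in the numerator:
    # the leftover polarization vectors must be dotted into each other in pairs.
    max_momenta = max(v3 for v3, _ in vertex_counts(n))
    return (n - max_momenta) // 2

for n in range(4, 9):
    print(n, vertex_counts(n), minimum_polarization_pairs(n))
```

For every $n$ the largest allowed $V_3$ is $n - 2$, so at least one factor of $\epsilon_i \cdot \epsilon_j$ must appear, and $V_3$ always has the same parity as $n$.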

Now we are ready to rock. For the amplitude $A(+ + \cdots +)$ (suppressing the momentum 
labels), we simply choose the spinors $\mu_i$ representing the gauge degrees of freedom to all 
be equal. Then all dot products $\epsilon_i^+ \cdot \epsilon_j^+$ between polarization vectors vanish according to 
(11). But we just argued that the tree amplitude must contain at least one power of $\epsilon_i \cdot \epsilon_j$. 
Remarkably, we have shown that the maximal helicity-violating amplitude vanishes for any 
$n$! Our intuition suggested that these amplitudes are suppressed, but in fact they vanish. 

What about the next-to-maximal helicity-violating amplitudes with one negative helicity, 
namely $A(- + \cdots + +)$? Label the gluon with negative helicity as 1. Once again, for 
$i = 2, \cdots, n$, choose $\mu_i$ all equal to $\lambda_1$. Then $\epsilon_i^+ \cdot \epsilon_j^+ \propto \langle\mu_i\mu_j\rangle = 0$, for $i, j \ne 1$. Furthermore, 
$\epsilon_1^- \cdot \epsilon_i^+ \propto \langle\lambda_1\mu_i\rangle = \langle\lambda_1\lambda_1\rangle = 0$ for $i \ne 1$. The amplitude $A(- + \cdots + +)$ also vanishes! 
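The two gauge choices just described can be tried out directly on the formulas (11) and (13). This is only an illustrative sketch: the spinors are random complex numbers, momentum conservation is not imposed (it plays no role in these particular products), the sign convention for the brackets is assumed, and the function names are hypothetical.

```python
import random
from itertools import combinations

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

def plus_plus(i, j, lam, lam_t, mu):
    # Equation (11): eps_i^+ . eps_j^+
    return (bracket(mu[i], mu[j]) * bracket(lam_t[i], lam_t[j])
            / (bracket(mu[i], lam[i]) * bracket(mu[j], lam[j])))

def minus_plus(i, j, lam, lam_t, mu_t_i, mu):
    # Equation (13): eps_i^- . eps_j^+, with mu_t_i the reference spinor of gluon i
    return (bracket(lam[i], mu[j]) * bracket(mu_t_i, lam_t[j])
            / (bracket(lam_t[i], mu_t_i) * bracket(mu[j], lam[j])))

rng = random.Random(3)
n = 6
lam = [random_spinor(rng) for _ in range(n)]
lam_t = [random_spinor(rng) for _ in range(n)]

# All-plus amplitude: every mu_i equal to one common spinor.
common = random_spinor(rng)
mu_all_plus = [common] * n
all_plus_products = [plus_plus(i, j, lam, lam_t, mu_all_plus)
                     for i, j in combinations(range(n), 2)]

# One-minus amplitude: gluon 1 (index 0) is negative, mu_i = lambda_1 for the others.
mu_one_minus = [None] + [lam[0]] * (n - 1)
mu_t_1 = random_spinor(rng)
one_minus_products = [plus_plus(i, j, lam, lam_t, mu_one_minus)
                      for i, j in combinations(range(1, n), 2)]
one_minus_products += [minus_plus(0, j, lam, lam_t, mu_t_1, mu_one_minus)
                       for j in range(1, n)]

print(max(abs(x) for x in all_plus_products))
print(max(abs(x) for x in one_minus_products))
```

Every polarization dot product is zero in both cases, which together with the counting argument forces the two amplitudes to vanish.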

Clearly, this “cheap” trick of exploiting gauge freedom no longer works for the next 
amplitude with two negative helicities. To see why the trick does not work any more, look 
at $A(- - + \cdots + +)$ for instance. Once again we could, for $i = 3, \cdots, n$, choose $\mu_i$ all equal 
so that $\epsilon_i^+ \cdot \epsilon_j^+ = 0$ for $i, j \ge 3$, but then we don't have enough freedom to make all the other 
polarization dot products vanish. In fact, at some point, we had better have some nonvanishing 
amplitudes. In the literature, these amplitudes with two negative helicities are called 
maximal helicity-violating amplitudes. Upon crossing two of the gluons, they describe 
two gluons producing $n - 2$ gluons, with helicities $+ + \to + + \cdots +$, $- + \to - + \cdots +$, 
and $- - \to - - + \cdots +$. 

[Figure N.2.3: panels (a), (b), (c)] 


Explicit calculation of A(1, 2, 3, 4) 

The $n = 4$ case is the simplest. Take a deep breath and try to calculate $A(1^-, 2^-, 3^+, 4^+)$ 
and $A(1^-, 2^+, 3^-, 4^+)$. For 4-gluon scattering these two are the only nonvanishing tree 
amplitudes, since by parity the amplitudes with three minuses are related to the amplitudes 
with three pluses (which we know vanish), and so on. 

The bad news is that the calculation is fairly involved. The good news is that we can still 
exploit gauge freedom mercilessly and that the final answer is surprisingly simple. 

Tackle $A(1^-, 2^-, 3^+, 4^+)$ first. The relevant diagrams are shown in figure N.2.3. Let us 
simplify the notation as much as possible: write $\langle 12\rangle = \langle\lambda_1\lambda_2\rangle$, $[12] = [\tilde\lambda_1\tilde\lambda_2]$, and so forth. 

Now we need the colored Feynman rules in the form given in appendix 1. In line with 
the preceding discussion let us choose $\tilde\mu_1 = \tilde\mu_2 = \tilde\lambda_3$ and $\mu_3 = \mu_4 = \lambda_2$. Then all but one 
of the polarization dot products vanish. For instance, $\epsilon_2^- \cdot \epsilon_3^+ \propto \langle\lambda_2\mu_3\rangle[\mu_2\lambda_3] \propto \langle\lambda_2\lambda_2\rangle = 0$. 
The only nonzero product is $\epsilon_1^- \cdot \epsilon_4^+ = \langle\lambda_1\mu_4\rangle[\mu_1\lambda_4]/([\lambda_1\mu_1]\langle\mu_4\lambda_4\rangle) = \langle 12\rangle[34]/([13]\langle 24\rangle)$, 
where the second equality follows from our gauge choice. This implies that the quartic 
diagram N.2.3a vanishes, since it involves the product of two polarization dot products. 

We notice that there are only two more diagrams (fig. N.2.3b,c) rather than three. With 
the traditional Feynman rules there is a diagram with 1 and 3 on the same cubic vertex. 
Here we see another advantage of color stripping. We are looking at the coefficient of 
$\mathrm{tr}(T^{a_1}T^{a_2}T^{a_3}T^{a_4})$. The diagram we just described has $T^{a_1}$ next to $T^{a_3}$ and so does not 
contribute to this particular color ordering. 

Next, the diagram in figure N.2.3b vanishes. Look at the cubic vertex involving 2, 3, and 
$v$ (for the virtual gluon): $(\epsilon_2 \cdot \epsilon_3\, \epsilon_v \cdot p_2 + \epsilon_3 \cdot \epsilon_v\, \epsilon_2 \cdot p_3 + \epsilon_v \cdot \epsilon_2\, \epsilon_3 \cdot p_v)$, with $\epsilon_v$ understood 
as a “placeholder” to be contracted with the $\epsilon_v^*$ from the other cubic vertex. The first term 
vanishes because $\epsilon_2 \cdot \epsilon_3 = 0$, the second term because $\epsilon_2 \cdot p_3 \propto [\mu_2\lambda_3] = [\lambda_3\lambda_3] = 0$, and the 
third term because $\epsilon_3 \cdot p_v = -\epsilon_3 \cdot (p_2 + p_3) = -\epsilon_3 \cdot p_2 \propto \langle\mu_3\lambda_2\rangle = \langle\lambda_2\lambda_2\rangle = 0$. Our gauge 
choice was wise indeed! 

Only one diagram (fig. N.2.3c) left to calculate. The cubic vertex $(\epsilon_1 \cdot \epsilon_2\, \epsilon_v \cdot p_1 + \epsilon_2 \cdot \epsilon_v\, \epsilon_1 \cdot 
p_2 + \epsilon_v \cdot \epsilon_1\, \epsilon_2 \cdot p_v)$ is to be contracted with the other cubic vertex $(\epsilon_3 \cdot \epsilon_4\, \epsilon_v^* \cdot p_3 + \epsilon_4 \cdot \epsilon_v^*\, \epsilon_3 \cdot 
p_4 + \epsilon_v^* \cdot \epsilon_3\, \epsilon_4 \cdot (-p_v))$. In each of these vertices, the first term vanishes, since the only 
nonzero polarization product is $\epsilon_1 \cdot \epsilon_4$. To obtain the amplitude we replace the polarization 
product $\epsilon_v^\rho\epsilon_v^{\omega *}$ for the placeholder by the propagator $-ig^{\rho\omega}/(p_1 + p_2)^2 = -ig^{\rho\omega}/(2 p_1 \cdot p_2)$. 

Again, since all but one of the polarization dot products vanish, only one term survives 
the contraction with $g^{\rho\omega}$. We obtain $A(1^-, 2^-, 3^+, 4^+) = \epsilon_1 \cdot \epsilon_4\, \epsilon_2 \cdot p_1\, \epsilon_3 \cdot p_4 / (p_1 \cdot p_2)$. 

Since we are after conceptual understanding more than anything else, now and henceforth, 
in this and the next two chapters, we will suppress overall factors to keep various 
expressions as uncluttered as possible. 

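We have already calculated $\epsilon_1 \cdot \epsilon_4$, so it remains to evaluate $\epsilon_2 \cdot p_1 = \langle 21\rangle[31]/[23]$, $\epsilon_3 \cdot 
p_4 = \langle 24\rangle[34]/\langle 23\rangle$, and $p_1 \cdot p_2 = \langle 12\rangle[12]$. Thus $A = \langle 12\rangle[34]^2/([12][23]\langle 23\rangle)$. 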

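We can now use various identities to write this in a more symmetric form. First, momentum 
conservation gives $\sum_i p^i_{\alpha\dot\alpha} = \sum_i \lambda^i_\alpha\tilde\lambda^i_{\dot\alpha} = 0$. Multiplying this by $\lambda^j_\beta\,\varepsilon^{\beta\alpha}\varepsilon^{\dot\alpha\dot\beta}\,\tilde\lambda^k_{\dot\beta}$ we obtain 
$\sum_i \langle ji\rangle[ik] = 0$ for any $j$ and $k$. Second, we have $\langle 34\rangle[34] = p_3 \cdot p_4 = p_1 \cdot p_2 = \langle 12\rangle[12]$. 
Finally, the spinors can be regarded as 2-dimensional vectors and so any two spinors $\mu$ and 
$\nu$ span the space. Thus a third spinor $\lambda$ can always be expanded as a linear combination of 
the other two, viz, $\lambda = (\langle\lambda\nu\rangle\mu - \langle\lambda\mu\rangle\nu)/\langle\mu\nu\rangle$, with the coefficients determined easily by 
contracting with $\mu$ and $\nu$. Contracting with a fourth spinor $\eta$ then yields 

$$\langle\lambda\eta\rangle\langle\mu\nu\rangle = \langle\lambda\nu\rangle\langle\mu\eta\rangle - \langle\lambda\mu\rangle\langle\nu\eta\rangle \tag{16}$$

known as the Schouten identity. 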
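Both the expansion of a spinor in a two-dimensional basis and the resulting Schouten identity (16) are easy to confirm numerically. A minimal sketch, with random complex spinors and an assumed sign convention for the angle bracket (the identity itself does not depend on that sign):

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

rng = random.Random(4)
lam, mu, nu, eta = (random_spinor(rng) for _ in range(4))

# Expansion lam = (<lam nu> mu - <lam mu> nu) / <mu nu>
coefficient_mu = bracket(lam, nu) / bracket(mu, nu)
coefficient_nu = -bracket(lam, mu) / bracket(mu, nu)
rebuilt = [coefficient_mu * mu[i] + coefficient_nu * nu[i] for i in range(2)]
expansion_error = max(abs(rebuilt[i] - lam[i]) for i in range(2))

# Schouten identity (16)
lhs = bracket(lam, eta) * bracket(mu, nu)
rhs = bracket(lam, nu) * bracket(mu, eta) - bracket(lam, mu) * bracket(nu, eta)
schouten_error = abs(lhs - rhs)

print(expansion_error, schouten_error)
```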

Using these identities we now massage $A$ into shape. Multiply the numerator and 
denominator of $A$ by $\langle 34\rangle$ to obtain $\langle 12\rangle^2[34]/(\langle 23\rangle\langle 34\rangle[23])$. Next, multiply the numerator 
and denominator by $\langle 12\rangle^2$. In the denominator write $\langle 12\rangle[23] = -\langle 14\rangle[43]$. Finally, we 
obtain (suppressing overall phase factors, as promised) 

$$A(1^-, 2^-, 3^+, 4^+) = \frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle} = \frac{p_1 \cdot p_2}{p_2 \cdot p_3} \tag{17}$$

Compare this with figure N.2.2. You should be impressed, even though here we are doing 
the $n = 4$ rather than the $n = 5$ case. 
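The massaging of $A$ into the form (17) can be cross-checked on explicit kinematics. The sketch below is an illustration under stated assumptions: it uses complex momenta (independent $\lambda_i$ and $\tilde\lambda_i$), builds momentum conservation by solving $\sum_i \lambda_i\tilde\lambda_i = 0$ for $\tilde\lambda_3$ and $\tilde\lambda_4$, and adopts the sign convention $\varepsilon^{12} = +1$. The helper names are hypothetical. In this convention the intermediate expression and the compact one differ by an overall sign, which is the kind of overall factor the text suppresses.

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

rng = random.Random(1)
lam = [random_spinor(rng) for _ in range(4)]
lam_t = [random_spinor(rng) for _ in range(2)]   # lambda-tilde for gluons 1 and 2

# K = -(p_1 + p_2) must equal p_3 + p_4 = lam_3 lam_t_3 + lam_4 lam_t_4.
K = [[-(lam[0][a] * lam_t[0][b] + lam[1][a] * lam_t[1][b]) for b in range(2)]
     for a in range(2)]

def contract_left(spinor):
    # eps^{ab} spinor_a K_{b bdot}
    return [spinor[0] * K[1][b] - spinor[1] * K[0][b] for b in range(2)]

# Contracting K with lam_4 isolates lam_t_3, and contracting with lam_3 isolates lam_t_4.
lam_t.append([x / bracket(lam[3], lam[2]) for x in contract_left(lam[3])])
lam_t.append([x / bracket(lam[2], lam[3]) for x in contract_left(lam[2])])

conservation_residual = max(
    abs(sum(lam[i][a] * lam_t[i][b] for i in range(4)))
    for a in range(2) for b in range(2)
)

def ang(i, j):
    return bracket(lam[i - 1], lam[j - 1])

def sq(i, j):
    return bracket(lam_t[i - 1], lam_t[j - 1])

intermediate = ang(1, 2) * sq(3, 4) ** 2 / (sq(1, 2) * sq(2, 3) * ang(2, 3))
compact = ang(1, 2) ** 4 / (ang(1, 2) * ang(2, 3) * ang(3, 4) * ang(4, 1))
mandelstam_mismatch = abs(ang(1, 2) * sq(1, 2) - ang(3, 4) * sq(3, 4))

print(conservation_residual, mandelstam_mismatch, abs(intermediate + compact))
```

The check uses only the two momentum-conservation identities quoted in the text, so it holds for complex as well as real momenta.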

Recall that we have another amplitude $A(1^-, 2^+, 3^-, 4^+)$ yet to calculate, in which the 
two negative-helicity gluons are not adjacent in color. You should work this out as an 
exercise, but it turns out that we can use a trick. Write the analog of (2) for 4-gluon scattering 

$$\mathcal M = i \sum_{\text{permutations}} \mathrm{tr}(T^{a_1}T^{a_2}T^{a_3}T^{a_4})\, A(1^-, 2^+, 3^-, 4^+) \tag{18}$$

We have already remarked that if we extend the gauge group from $SU(N)$ to $U(N)$, the 
extra gluon (known in the literature, perhaps confusingly, as the “photon”) does not couple 
to the other gluons (because the couplings in Yang-Mills theory all involve commutators; 
see appendix 1). Thus if we replace, say, $T^{a_2}$ by the identity matrix, the entire sum should 
vanish. The six terms in the sum then break up into two groups, multiplied by either 
$\mathrm{tr}(T^{a_1}T^{a_3}T^{a_4})$ or $\mathrm{tr}(T^{a_1}T^{a_4}T^{a_3})$. Since the two traces are independent, the two groups 
vanish separately. The traces in the three terms $\mathrm{tr}(T^{a_1}T^{a_2}T^{a_3}T^{a_4})A(1^-, 2^+, 3^-, 4^+) + 
\mathrm{tr}(T^{a_1}T^{a_3}T^{a_2}T^{a_4})A(1^-, 3^-, 2^+, 4^+) + \mathrm{tr}(T^{a_1}T^{a_3}T^{a_4}T^{a_2})A(1^-, 3^-, 4^+, 2^+)$ all become 
$\mathrm{tr}(T^{a_1}T^{a_3}T^{a_4})$. We thus obtain the so-called photon decoupling identity $A(1^-, 2^+, 3^-, 4^+) 
+ A(1^-, 3^-, 2^+, 4^+) + A(1^-, 3^-, 4^+, 2^+) = 0$, relating the desired amplitude to two amplitudes 
already known from (17). Thus 

$$A(1^-, 2^+, 3^-, 4^+) = -\left(A(1^-, 3^-, 2^+, 4^+) + A(1^-, 3^-, 4^+, 2^+)\right) = -\langle 13\rangle^4 \left(\frac{1}{\langle 13\rangle\langle 32\rangle\langle 24\rangle\langle 41\rangle} + \frac{1}{\langle 13\rangle\langle 34\rangle\langle 42\rangle\langle 21\rangle}\right) = \frac{\langle 13\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle\langle 41\rangle} \tag{19}$$

where we used the Schouten identity. 
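The last equality in (19) involves only angle brackets, so it can be tested on four arbitrary spinors. A short sketch, with a hypothetical helper `parke_taylor` that evaluates $\langle jk\rangle^4$ divided by the cyclic product of brackets for a given color ordering (labels are 1-based, as in the text):

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

def parke_taylor(lam, order, minus):
    # <jk>^4 / (<o1 o2><o2 o3>...<on o1>) for the color ordering `order`.
    j, k = minus
    denominator = 1
    for a, b in zip(order, order[1:] + order[:1]):
        denominator *= bracket(lam[a - 1], lam[b - 1])
    return bracket(lam[j - 1], lam[k - 1]) ** 4 / denominator

rng = random.Random(5)
lam = [random_spinor(rng) for _ in range(4)]

adjacent_a = parke_taylor(lam, [1, 3, 2, 4], (1, 3))   # A(1-, 3-, 2+, 4+) from (17), relabeled
adjacent_b = parke_taylor(lam, [1, 3, 4, 2], (1, 3))   # A(1-, 3-, 4+, 2+) from (17), relabeled
from_decoupling = -(adjacent_a + adjacent_b)
closed_form = parke_taylor(lam, [1, 2, 3, 4], (1, 3))  # right-hand side of (19)

print(abs(from_decoupling - closed_form))
```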

Remarkably, the two amplitudes $A(1^-, 2^-, 3^+, 4^+)$ and $A(1^-, 2^+, 3^-, 4^+)$ have the same 
form. It is tempting to conjecture that for $n$-gluon scattering, the maximal helicity-violating 
amplitudes in which two of the gluons carry negative helicity and the rest positive helicity 
are given by the elegant expression (for $n \ge 4$) 

$$A(1^+, 2^+, \cdots, j^-, \cdots, k^-, \cdots, n^+) = \frac{\langle jk\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle \cdots \langle (n-1)n\rangle\langle n1\rangle} \tag{20}$$

This conjecture was first put forward by Parke and Taylor and proved by Berends and Giele 
(using an off-shell recursion method and a precursor to the on-shell recursion method to 
be explained in the next chapter). We will prove it in the next chapter. 

Meanwhile, we note that one way of arguing for the conjecture’s validity is to verify 
that the proposed amplitude satisfies all the symmetry requirements. Besides Lorentz 
invariance (obviously satisfied), amplitudes at tree level in a massless theory like pure 
Yang-Mills should also satisfy scale and conformal invariance. 

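One interesting check is to count, for each $i$, the powers of $\lambda_i$ minus the powers of $\tilde\lambda_i$. 
Call this quantity $\Delta_i$. Then since momentum has the form $\lambda\tilde\lambda$ it contributes 0 to $\Delta_i$. In 
contrast, for negative helicity $\epsilon^-_{\alpha\dot\alpha} = \lambda_\alpha\tilde\mu_{\dot\alpha}/[\tilde\lambda, \tilde\mu] \sim \lambda/\tilde\lambda$. For positive helicity we have the 
opposite: $\epsilon^+_{\alpha\dot\alpha} = \mu_\alpha\tilde\lambda_{\dot\alpha}/\langle\mu, \lambda\rangle \sim \tilde\lambda/\lambda$. Thus we have $\Delta_i = -2h_i$. 

We checked that indeed, in (20), we have $\Delta_i = 2$ for $i = j, k$ and $\Delta_i = -2$ for $i \ne j, k$. 
Keeping track of $\Delta_i$ during the calculation also provides us with a useful check. 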
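One way to measure $\Delta_i$ for the expression (20) is to rescale a single $\lambda_i$ by a factor $t$ and see which power of $t$ comes out. This is a sketch of that idea, not a procedure taken from the text: since (20) contains no $\tilde\lambda$, the exponent equals the power of $\lambda_i$ directly. The function name and the choice $n = 6$ with negative helicities on gluons 2 and 5 are illustrative assumptions.

```python
import math
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

def parke_taylor(lam, j, k):
    # Expression (20) with 0-based indices j, k for the two negative-helicity gluons.
    n = len(lam)
    denominator = 1
    for i in range(n):
        denominator *= bracket(lam[i], lam[(i + 1) % n])
    return bracket(lam[j], lam[k]) ** 4 / denominator

rng = random.Random(6)
n, j, k = 6, 1, 4
lam = [random_spinor(rng) for _ in range(n)]
reference = parke_taylor(lam, j, k)

t = 1.3
deltas = []
for i in range(n):
    scaled = [list(s) for s in lam]
    scaled[i] = [t * x for x in scaled[i]]
    ratio = parke_taylor(scaled, j, k) / reference
    deltas.append(round(math.log(abs(ratio)) / math.log(t)))

helicities = [-d // 2 for d in deltas]   # Delta_i = -2 h_i
print(deltas, helicities)
```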

Note that the $n = 5$ scattering amplitude, which we started this chapter with, is 
completely determined, since there are only two independent nonzero amplitudes: 
$A(1^-, 2^-, 3^+, 4^+, 5^+)$ and $A(1^-, 2^+, 3^-, 4^+, 5^+)$. 


Further developments 

The astonishing simplicity of (20) has sparked a surge of interest and further developments. 
Here I will be content to mention some of them. 

Once the tree amplitudes are done, one can calculate loop amplitudes by using a 
more sophisticated version of the unitarity methods and of the Cutkosky cutting rules 
of chapter II.8. Proceeding in this way, various authors have bootstrapped their way up to 
multiloop amplitudes. While the actual computational labor can quickly get out of hand, 
it is still enormously less than the labor needed with traditional Feynman methods. 

What about the basic cubic vertex of the theory? We will work it out in appendix 2 and 
show that it fits nicely into the form in (20) with one important caveat. 

Surely you, the astute reader, feel that there must be some deep reason for the astonishing 
simplification from the mess in figure (1) to the elegant expression in (20). Indeed, 
tree amplitudes in gauge theories (and in gravity) turn out to be even simpler when written 
in terms of the twistors studied by Penrose decades ago. As this exciting development 2 
occurred while this book was going to press, I have to balance my desire to make the book 
as up-to-date as possible against pagination constraints. Thus I can provide here only an 
ultra-concise (and hence perhaps somewhat cryptic) key to the literature, giving you no 
more than a flavor of what is involved. 

Include the momentum conservation delta function with the amplitudes of the type 
studied here and define $M(\cdots, \lambda_i, \tilde\lambda_i, \cdots) = A(\lambda, \tilde\lambda)\,\delta^{(4)}(\sum_{j=1}^n \lambda_j\tilde\lambda_j)$. Due to space constraints, 
I will suppress the kinematic dependence of $M$ on all but the particle $i$ and write 
simply $M(\lambda_i, \tilde\lambda_i)$. Let us Fourier transform $M$ in two possible ways (and overuse the letter 
$M$ somewhat): 

$$M(W_i) = \int d^2\lambda_i \exp(i\tilde\mu_i^{\alpha}\lambda_{i\alpha})\, M(\lambda_i, \tilde\lambda_i) \tag{21a}$$

and 

$$M(Z_i) = \int d^2\tilde\lambda_i \exp(i\mu_i^{\dot\alpha}\tilde\lambda_{i\dot\alpha})\, M(\lambda_i, \tilde\lambda_i) \tag{21b}$$

where $W = (\tilde\mu, \tilde\lambda)$ and $Z = (\lambda, \mu)$ denote two 4-component objects which may be regarded 
for the time being as column “vectors.” The intent here is to transform $M$ sequentially 
for $i = 1, 2, \cdots, n$ using either (21a) or (21b). Consider $SO(2, 2)$ here instead of 
$SO(3, 1)$, so that the spinors $\lambda$ and $\tilde\lambda$ are real, and hence we can take $\mu$ and $\tilde\mu$ to be real 
as well. Thus, these integral transforms are no more and no less than the Fourier transforms 
you have long been familiar with, and the variable $\tilde\mu$ is conjugate to the variable $\lambda$ 
in the same sense that $p$ is conjugate to $q$ in quantum mechanics. The objects $W$ and $Z$, 
known as a twistor and a dual twistor and conjugate to each other, each consisting of 4 
real components, naturally transform under the group $SL(4, R)$ (namely the set of all 4 by 
4 matrices with real entries and unit determinant), with the invariant $W \cdot Z = \tilde\mu\lambda + \tilde\lambda\mu$. 
Given more than one $W$'s and $Z$'s we also have the Lorentz invariants $Z_1 I Z_2 = \langle\lambda_1, \lambda_2\rangle$ 
and $W_1 I W_2 = [\tilde\lambda_1, \tilde\lambda_2]$. (Here $I$, in a slightly abused notation used in the literature, evidently 
denotes the 4 by 4 matrix containing the 2 by 2 identity matrix either in its upper left corner 
or in its lower right corner depending on whether it acts on $W$ or $Z$, with all other entries 
equal to zero.) 

2 The literature on twistors could be traced starting with the paper mentioned on p. 477. 

We have (displaying the helicity $h$ of particle $i$ while suppressing the index $i$) $M(tW, h) = 
\int d^2\lambda \exp(it\tilde\mu\lambda)\, M(\lambda, t\tilde\lambda, h) = t^{-2}\int d^2\lambda' \exp(i\tilde\mu\lambda')\, M(t^{-1}\lambda', t\tilde\lambda, h) = t^{2(h-1)} M(W, h)$, 
where we used the observation earlier that $\Delta = -2h$, namely that $M(t^{-1}\lambda, t\tilde\lambda) = 
t^{2h} M(\lambda, \tilde\lambda)$. Similarly, $M(tZ, h) = t^{-2(h+1)} M(Z, h)$. This scaling result, which you realize 
comes from the little group (see p. 186), indicates that we should favor a mixed or 
ambitwistor representation for the scattering amplitude, using $W$ when the particle carries 
$+$ helicity and $Z$ when the particle carries $-$ helicity. 
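The two scaling exponents can be tabulated for the helicities of interest. The sketch below simply evaluates the exponents $2(h-1)$ and $-2(h+1)$ read off from the scaling relations above; the function names are hypothetical.

```python
def twistor_weight(h):
    # Exponent w in M(tW, h) = t**w * M(W, h), read off from the text: w = 2(h - 1).
    return 2 * (h - 1)

def dual_twistor_weight(h):
    # Exponent w in M(tZ, h) = t**w * M(Z, h), read off from the text: w = -2(h + 1).
    return -2 * (h + 1)

weights = {h: (twistor_weight(h), dual_twistor_weight(h)) for h in (+1, -1, +2, -2)}
for h, (w_weight, z_weight) in weights.items():
    print(f"h = {h:+d}: W exponent {w_weight:+d}, Z exponent {z_weight:+d}")
```

For gluons the exponent vanishes exactly when $W$ is paired with $h = +1$ and $Z$ with $h = -1$, which is the pairing the ambitwistor representation uses; for helicity $\pm 2$ the same pairing gives exponent 2 in both cases.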

For example, for the basic Yang-Mills cubic vertex with helicities $(+ + -)$ (see appendix 
2) we write $M(W_1^+, W_2^+, Z_3^-)$. The scaling relation just derived imposes powerful constraints 
on this amplitude, namely $M(W_1, W_2, Z_3) = M(tW_1, W_2, Z_3) = M(W_1, tW_2, Z_3) = 
M(W_1, W_2, tZ_3)$, which implies that in the ambitwistor representation the defining vertex 
for Yang-Mills theory is apparently, up to an irrelevant overall constant, just 1! More 
precisely, $M(W_1, W_2, Z_3)$ depends on the three possible invariants $W_1 \cdot Z_3$, $W_2 \cdot Z_3$, and 
$W_1 I W_2$. The scaling relations (note that $t$ could be either positive or negative) then force 
$M$ to have the amazingly simple form 

$$M(W_1^+, W_2^+, Z_3^-) = \mathrm{sign}(W_1 \cdot Z_3)\,\mathrm{sign}(W_2 \cdot Z_3)\,\mathrm{sign}(W_1 I W_2)$$

In different kinematic regions, the basic Yang-Mills vertex is numerically equal to $\pm 1$. 

Tree amplitudes live naturally in ambitwistor space. As another example, the 4-gluon 
scattering amplitude (19) we worked hard to get becomes simply 

$$M(W_1^+, Z_2^-, W_3^+, Z_4^-) = \mathrm{sign}(W_1 \cdot Z_2)\,\mathrm{sign}(Z_2 \cdot W_3)\,\mathrm{sign}(W_3 \cdot Z_4)\,\mathrm{sign}(Z_4 \cdot W_1)$$

Let's anticipate a bit and write the basic cubic vertex for gravity to be given in (N.3.20) 
in this ambitwistor representation. Indeed, the scaling relations derived above could 
be immediately applied to the graviton, for which $h = \pm 2$. We obtain $M(tW, ++) = 
t^2 M(W, ++)$ and $M(tZ, --) = t^2 M(Z, --)$, thus immediately fixing the cubic vertex 
for gravity to be 

$$M(W_1^{++}, W_2^{++}, Z_3^{--}) = |(W_1 \cdot Z_3)(W_2 \cdot Z_3)(W_1 I W_2)|$$

Going from Yang-Mills to Einstein-Hilbert, we merely have to replace the sign function by 
the absolute value! 

Clearly, the take-home message is that quantum field theory possesses hidden structures 
that the traditional Feynman diagram approach would likely have no hope of uncovering. 


Appendix 1: Colored Feynman rules for Yang-Mills theory 


Using the double line formalism of chapter IV.5, we can draw the cubic and quartic vertices in Yang-Mills theory 
as in figure IV.5.2. Our conventions for the generators of $SU(N)$ are $[T^a, T^b] = if^{abc}T^c$ and $\mathrm{tr}(T^aT^b) = \frac12\delta^{ab}$. 
Thus $f^{abc} = -2i\,\mathrm{tr}([T^a, T^b]T^c)$. Start with the Feynman rule for the quartic vertex given in chapter VII.1 and 
appendix C. First, $f^{abe}f^{cde} = -4\,\mathrm{tr}([T^a, T^b][T^c, T^d])$. Next we multiply by polarization vectors and obtain the 
colored rule for the quartic vertex: 

$$4ig^2\,\mathrm{tr}(T^aT^bT^cT^d)(\epsilon_1 \cdot \epsilon_2\,\epsilon_3 \cdot \epsilon_4 - \epsilon_4 \cdot \epsilon_2\,\epsilon_1 \cdot \epsilon_3) \tag{22}$$

The two other terms are obtained by permutation. Similarly, the cubic vertex in (C.18) becomes (with a trivial 
change $k \to p$) 

$$-4ig\,\mathrm{tr}(T^aT^bT^c)(\epsilon_1 \cdot \epsilon_2\,\epsilon_3 \cdot p_1 + \epsilon_2 \cdot \epsilon_3\,\epsilon_1 \cdot p_2 + \epsilon_3 \cdot \epsilon_1\,\epsilon_2 \cdot p_3) \tag{23}$$

As described in the text, we can now strip off the color factors $\mathrm{tr}(T^aT^bT^c)$ and $\mathrm{tr}(T^aT^bT^cT^d)$. 

Color stripped amplitudes satisfy a number of useful identities. For example, the color stripped amplitude for 
$n$-gluon scattering satisfies the reflection identity $A(1, 2, \cdots, n) = (-1)^n A(n, \cdots, 2, 1)$. To show this, note that 
the stripped quartic vertex $(\epsilon_1 \cdot \epsilon_2\,\epsilon_3 \cdot \epsilon_4 - \epsilon_4 \cdot \epsilon_2\,\epsilon_1 \cdot \epsilon_3)$ does not change sign under the reflection $1234 \to 4321$, 
while the stripped cubic vertex changes sign under $123 \to 321$. From exercise N.2.2, $V_3 + 2V_4 = n - 2$, and thus 
$V_3$ is odd or even according to whether $n$ is odd or even. 


Appendix 2: The cubic vertex in the spinor helicity formalism 


A rather natural question to ask is what the cubic vertex (23) looks like in the spinor helicity formalism. 

The first observation is that if we put all momenta on shell, $p_1^2 = p_2^2 = p_3^2 = 0$, then the cubic vertex actually 
vanishes. By momentum conservation, we have $p_1^2 = (p_2 + p_3)^2 = 2p_2 \cdot p_3 = 0$. The conditions $p_i \cdot p_j = 0$ then 
imply all three lightlike momenta point in the same direction, so that $p_i = E_i(1, 0, 0, 1)$, $i = 1, 2, 3$. But this 
means that, for example, $\epsilon_3 \cdot p_1 \propto \epsilon_3 \cdot p_3 = 0$, and thus the cubic vertex (23) vanishes. 

Now you see the motivation for allowing the momenta to be complex. Then the conditions $p_i \cdot p_j = 0$ no 
longer force all three lightlike momenta to point in the same direction, and we can have a nonvanishing cubic 
vertex on shell. As explained in the text, to complexify momentum, we simply remove the constraint $\tilde\lambda = \lambda^*$. By 
the way, by complexifying the momenta here, we are anticipating the discussion in the next chapter a bit. 

As always, we are free to choose the $\mu$ spinors to our advantage. A good choice here is $\tilde\mu_1 = \tilde\mu_2$ and $\mu_3 = 
\lambda_1$. Referring to (12) and (13), we then have $\epsilon_1^- \cdot \epsilon_2^- \propto [\mu_1\mu_2] = 0$ and $\epsilon_1^- \cdot \epsilon_3^+ \propto \langle\lambda_1\mu_3\rangle = 0$. The cubic vertex 
collapses to 

$$\left(\frac{\langle\lambda_2\mu_3\rangle[\mu_2\lambda_3]}{[\lambda_2\mu_2]\langle\mu_3\lambda_3\rangle}\right)\left(\frac{\langle\lambda_1\lambda_2\rangle[\mu_1\lambda_2]}{[\lambda_1\mu_1]}\right) = \frac{\langle 12\rangle^2}{\langle 13\rangle}\frac{[\mu_1\lambda_3]}{[\mu_1\lambda_1]} \tag{24}$$

As in the text, we are ignoring all overall factors. 

To get rid of the unphysical $\tilde\mu_1$, we need a variant of the momentum conservation identity given in the text. 
Multiplying $\sum_i p^i_{\alpha\dot\alpha} = \sum_i \lambda^i_\alpha\tilde\lambda^i_{\dot\alpha} = 0$ by $\tilde\mu_1^{\dot\alpha}\lambda_j^{\alpha}$, we obtain $\sum_i [\mu_1\lambda_i]\langle\lambda_i\lambda_j\rangle = 0$ for any $j$, which for $j = 2$ 
implies $[\mu_1\lambda_3]\langle\lambda_3\lambda_2\rangle = -[\mu_1\lambda_1]\langle\lambda_1\lambda_2\rangle$. 

Multiplying (24) by $\langle\lambda_3\lambda_2\rangle/\langle\lambda_3\lambda_2\rangle$ and applying the identity just derived we finally obtain the “mostly minus” 
cubic vertex 

$$A(1^-, 2^-, 3^+) = \frac{\langle 12\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 31\rangle} \tag{25}$$

Satisfyingly, we have obtained an expression consistent with (20) (which we have not yet proven) but keep in 
mind that (25) holds only for complex momenta. I leave it to you to obtain the “mostly plus” cubic vertex 

$$A(1^+, 2^+, 3^-) = \frac{[12]^4}{[12][23][31]} \tag{26}$$

which also follows from the rule about flipping helicities stated in the text. 
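The step from (24) to (25) can be checked on explicit complex three-particle kinematics. This sketch makes an assumption that is not spelled out above: momentum conservation for three lightlike momenta with generic $\lambda_i$ is solved by taking all three $\tilde\lambda_i$ proportional to one common spinor $\xi$, with coefficients $\langle 23\rangle$, $\langle 31\rangle$, $\langle 12\rangle$ (this follows from the Schouten identity). The names `xi`, `mu_t` and the sign convention are illustrative choices. In this convention (24) and (25) agree up to an overall sign, an overall factor of the kind being ignored.

```python
import random

def bracket(a, b):
    # Antisymmetric product a_1 b_2 - a_2 b_1 (assumed epsilon sign convention).
    return a[0] * b[1] - a[1] * b[0]

def random_spinor(rng):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]

rng = random.Random(7)
l1, l2, l3 = (random_spinor(rng) for _ in range(3))
xi = random_spinor(rng)

# Complex kinematics: lambda-tilde_i all proportional to xi.
lt1 = [bracket(l2, l3) * x for x in xi]
lt2 = [bracket(l3, l1) * x for x in xi]
lt3 = [bracket(l1, l2) * x for x in xi]

conservation_residual = max(
    abs(l1[a] * lt1[b] + l2[a] * lt2[b] + l3[a] * lt3[b])
    for a in range(2) for b in range(2)
)

def equation_24(mu_t):
    # Right-hand side of (24) with reference spinor mu_t = mu-tilde_1 = mu-tilde_2.
    return bracket(l1, l2) ** 2 / bracket(l1, l3) * bracket(mu_t, lt3) / bracket(mu_t, lt1)

equation_25 = bracket(l1, l2) ** 4 / (bracket(l1, l2) * bracket(l2, l3) * bracket(l3, l1))

first = equation_24(random_spinor(rng))
second = equation_24(random_spinor(rng))   # a different gauge choice

print(conservation_residual, abs(first - second), abs(first + equation_25))
```

The result does not depend on the reference spinor, as gauge invariance requires. Note that in this kinematic configuration every square bracket among the $\tilde\lambda_i$ vanishes, so the “mostly plus” vertex (26) needs the other configuration, with the $\lambda_i$ proportional instead.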

What about the “all plus” and “all minus” vertices? By now you should be able to determine them as a simple 
exercise. 


Exercises 

N.2.1 Work out the two polarization vectors for general $\mu$ and $\tilde\mu$ for a gluon moving along the third direction. 

N.2.2 Show that the number of cubic vertices in tree-level $n$-gluon scattering can be at most $n - 2$. 

N.2.3 Show that the result in (17) satisfies the reflection identity $A(1^-, 2^-, 3^+, 4^+) = A(4^+, 3^+, 2^-, 1^-)$. 

N.2.4 Show that the “all plus” and “all minus” cubic Yang-Mills vertices (see appendix 2) vanish. [Hint: Choose 
the $\mu$ spinors wisely.] 

N.2.5 Why doesn't the argument in the text that $A(- + \cdots + +)$ vanish apply to $A(- + +)$? 

N.2.6 Insert the expression for the cubic vertex into (21) and derive $M(W_1^+, W_2^+, Z_3^-)$. 

N.2.7 Show that $M(W_1^+, Z_2^-, W_3^+, Z_4^-)$ reproduces (19). 

N.2.8 Show that $SL(4, R)$ is locally isomorphic to the conformal group. [Hint: Identify the $15 = 4^2 - 1$ generators 
of the conformal group (3 rotations $J^i$, 3 boosts $K^i$, 1 dilation $D$, 4 translations $P^\mu$, and 4 conformal 
transformations $K^\mu$) with the 15 traceless real 4 by 4 matrices.] 




Subterranean Connections in Gauge Theories 


Excess baggage 

This text, like all texts on field theory, sings the praise of gauge theories—hey, Nature loves 
them regardless of what physicists like—but, unlike many texts, emphasizes repeatedly 
that gauge symmetry is strictly speaking not a symmetry, but a redundancy in description. 
Extra degrees of freedom are introduced only to be gauge fixed away. In the first edition of 
this book, I expressed in the Closing Words the hope that in the future physics will find 
a more elegant way of formulating this peculiar concept of local invariance. Perhaps that 
hope is being realized sooner rather than later! 

In our current formulation of gauge theories, for a process involving $n$ massless gauge 
bosons (photons or gluons) we are instructed to laboriously calculate an off-shell amplitude 
$\mathcal M^{\mu_1\mu_2\cdots\mu_n}$. 

But experimentalists don't know about amplitudes carrying Lorentz indices! SE from 
chapter III.1 speaks up again. “My gauge bosons are specified by their helicities $h_i$, $i = 
1, \cdots, n$, not a Lorentz index.” 

Come to think of it, we theorists do go through a strange two-step procedure involving 
a lot of excess baggage. After toiling to obtain $\mathcal M^{\mu_1\mu_2\cdots\mu_n}$ with external momenta off 
shell, we then set external momenta on shell and contract with polarization vectors to 
determine the scattering amplitude for gluons in specified polarization states $\mathcal M^{\lambda_1\lambda_2\cdots\lambda_n} = 
\epsilon^{\lambda_1}_{\mu_1}\epsilon^{\lambda_2}_{\mu_2}\cdots\epsilon^{\lambda_n}_{\mu_n}\mathcal M^{\mu_1\mu_2\cdots\mu_n}|_{\text{on shell}}$. In effect, in step 2 we wash away much of the unnecessary 
information in $\mathcal M^{\mu_1\mu_2\cdots\mu_n}$ we worked hard to get in step 1. 

The cancellation in the 5-gluon scattering in the preceding chapter, with ~10,000 terms 
boiling down to a single term, should have convinced you that the traditional Feynman 
way may not be so good. In your study of physics, you surely have had the pleasure of 
watching terms canceling against each other toward the end of a calculation, but 10,000 
terms down to 1, that was the mother of all cancellations. 

The key is that in gauge theories there is a kind of secret subterranean connection 
between different Feynman diagrams, and cancellations are routine. Gauge invariance tells 
us that $p^1_{\mu_1}(\epsilon^{\lambda_2}_{\mu_2}\cdots\epsilon^{\lambda_n}_{\mu_n}\mathcal M^{\mu_1\mu_2\cdots\mu_n}|_{\text{on shell}})$, for example, vanishes. Thus, the many diagrams 
that go into $\mathcal M^{\mu_1\mu_2\cdots\mu_n}$ must know about each other in some intricate way. (We saw a 
glimpse of that way back in chapter II.7 when we proved gauge invariance.) 


The S-matrix reloaded 

In spite of the tremendous difficulties lying ahead, I feel 
that S-matrix theory is far from dead and that . . . much 
new interesting mathematics will be created by attempting 
to formalize it. 

—T. Regge 1 

The garbage of the past often becomes the treasure of the 
present (and vice versa). 

—A. Polyakov 2 

As discussed in chapter III.8, back in the 1950s and 1960s, dispersion theorists 3 tried 
to forge ahead by studying the analytic properties of various amplitudes as functions 
of their external Lorentz invariants, namely the Mandelstam variables s and t for 2-to- 
2 scattering and q 2 in our simple vacuum polarization example. But once one gets past 
2-to-2 scattering, the analytic structure becomes unwieldy. The program failed and was 
swept into the dustbin of physics history. (However, you might know that, through a rather 
convoluted process, this massive effort eventually gave birth to string theory.) 

Remarkably, some features of this program are being revived. In particular, in this 
chapter we will discuss the notion of complexifying physical variables. In an interesting 
twist, it turns out to be better to complexify the external momenta (to be explained below) 
rather than invariants like s and t. A historical aside: Landau apparently suggested on one 
occasion that it might be useful to consider complex momenta. 

Consider the amplitude $\mathcal M(p_i, h_i)$ for tree-level scattering of $n$ massless particles with 
momentum and helicity $(p_i, h_i)$, $i = 1, \ldots, n$, with $p_i^2 = 0$. (For a gauge theory, we will 
define the amplitude with the color factors already stripped away. Also, suppress the trivial 
multiplicative coupling constant dependence and drop all such overall factors as we move 
along.) 

The novel idea is to pick two external momenta $p_r$, $p_s$, complexifying them while 
keeping them on shell and maintaining momentum conservation. We take all momenta 
as incoming. At this stage we can keep the discussion general and not even specify the 
theory except to stipulate that it contains only massless particles. But to fix ideas, you can 
imagine a gauge theory. For some complex number $z$, replace $p_r$ and $p_s$ by 

$$p_r(z) = p_r + zq \quad\text{and}\quad p_s(z) = p_s - zq \tag{1}$$

1 T. Regge, Publ. RIMS, Kyoto University, 12 suppl.: pp. 367-375,1977. 

2 A. Polyakov, Gauge Fields and Strings, CRC, 1987, p. 1. 

3 See, for example, G. Barton, Dispersion Techniques in Field Theory, W. A. Benjamin, 1965. 
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To keep p r (z) 2 = 0 and p s (z) 2 = 0, we need q ■ p r s = 0 and q 2 = 0, which is possible only 
if we make q complex. To be explicit, go to a frame in which p r + p s has only a time 
component and use units so that the time component is equal to 2. Then 

= (1,0,0,1), ft, = (1,0,0,-1), q = (0, 1, 1 ,0) (2) 

A technical aside. This is why I mentioned SO(2, 2) in appendix B and in the preceding 
chapter: with a $(+ + - -)$ signature one could satisfy the on-shell constraint without having 
a complex $q$. Here I will stick with the more physical SO(3, 1) and consider complex 
momenta as explained in the preceding chapter. Another side remark: As you will see, 
the discussion goes through for any spacetime dimension $d \geq 4$. 

With this setup the scattering amplitude $\mathcal{M}(z)$ becomes an analytic function of $z$. Think 
of the complex momentum $zq$ flowing into the diagram with $p_r(z)$, cruising through some 
of the internal lines, and then flowing out with $-p_s(z)$. Let us turn on our pole detector. 
At tree level, a pole can arise only from a propagator carrying momentum $zq + \cdots$. Thus 
the tree amplitude $\mathcal{M}(z)$ has only simple poles, coming from diagrams of the type shown 
in figure N.3.1. Divide the $p_i$ into two sets $L$ and $R$, with those in $L$ flowing into a blob on 
the left-hand side and those in $R$ flowing into a blob on the right-hand side. The two 
blobs are connected by a single propagator carrying momentum $P_L(z)$, which by arbitrary 
convention we choose to flow into blob $L$. The two blobs are themselves tree amplitudes in 
the theory. Let $n_L$ and $n_R$ be the number of external momenta in sets $L$ and $R$, respectively 
(with $n_L + n_R = n$, of course, and $n_L \geq 2$, $n_R \geq 2$). Then the left-hand blob represents tree 
scattering of $n_L + 1$ particles, with $n_L$ particles on shell and one particle with momentum 
$P_L(z)$ off shell, with an amplitude $\mathcal{M}_L(z)$. Similarly, the right-hand blob represents tree 
scattering of $n_R + 1$ particles, with $n_R$ particles on shell and one particle with momentum 
$-P_L(z)$ off shell, with an amplitude $\mathcal{M}_R(z)$. 

Clearly, the momentum $P_L(z)$ depends on $z$ only if $p_r(z)$ and $p_s(z)$ do not appear in 
the same set. With no loss of generality let $p_r(z)$ belong to the set $L$ and $p_s(z)$ to the set 
$R$. Then $P_L(z) = -(\sum_{i \in L} p_i + zq) = P_L(0) - zq$ and $P_L(z)^2 = P_L(0)^2 - 2z\, q \cdot P_L(0) = 
-2\, q \cdot P_L(0)\,(z - z_L)$, where $z_L = P_L(0)^2 / (2 q \cdot P_L(0))$. Thus $\mathcal{M}$ has a pole at $z = z_L$, which, 
since $q$ is complex, is in general complex. 
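To make the location of the pole concrete, here is a small sketch for $n = 4$ in the frame of (2). The two extra momenta `p_3` and `p_4`, the angles `theta` and `phi`, and the choice of partition $L = \{r, 3\}$ are hypothetical inputs picked only for illustration; the code evaluates $z_L = P_L(0)^2/(2q \cdot P_L(0))$ and checks that the internal line is on shell there.

```python
import math

def mdot(a, b):
    """Minkowski product, signature (+, -, -, -), without complex conjugation."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

theta, phi = 0.9, 0.4  # hypothetical angles fixing the other two momenta
p_r = (1, 0, 0, 1)
p_s = (1, 0, 0, -1)
q = (0, 1, 1j, 0)
# All momenta are incoming, so the remaining two carry negative energy.
p_3 = (-1.0,
       -math.sin(theta) * math.cos(phi),
       -math.sin(theta) * math.sin(phi),
       -math.cos(theta))
p_4 = (-1.0, -p_3[1], -p_3[2], -p_3[3])

# Partition L = {r, 3}, R = {s, 4}; P_L flows into blob L.
P_L0 = tuple(-(a + b) for a, b in zip(p_r, p_3))
z_L = mdot(P_L0, P_L0) / (2 * mdot(q, P_L0))

def P_L(z):
    """P_L(z) = P_L(0) - z q."""
    return tuple(a - z * b for a, b in zip(P_L0, q))

print("z_L =", z_L)
print("|P_L(z_L)^2| =", abs(mdot(P_L(z_L), P_L(z_L))))
```

With `phi = 0` the pole happens to sit on the real axis; a generic `phi` moves it into the complex plane.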

The amplitude $\mathcal{M}$ has poles all over the complex $z$-plane, at $z = z_L$, one for each 
valid partition of the external momenta into $L + R$. The residue has the factorized form 
$\mathcal{R}_L = -\mathcal{M}_L(z_L)\mathcal{M}_R(z_L)/(2q \cdot P_L(0))$, where $\mathcal{M}_L(z_L)$ and $\mathcal{M}_R(z_L)$ are now both on-shell 
amplitudes, since the particle carrying momentum $P_L(z_L)$ is now on shell. As always, we 
suppress all inessential overall factors. 

If, and that is a crucial if, $\mathcal{M}(z) \to 0$ as $z \to \infty$, then $\oint_C (dz/z)\,\mathcal{M}(z) = 0$, where the 
contour $C$ is a circle of infinite radius running along infinity. We then shrink the contour, 
picking up the pole at $z = 0$, which contributes $\mathcal{M}(0)$ to the contour integral, and a bunch 
of poles at $z = z_L$, contributing a sum of terms consisting of the residue at each pole, 
multiplied by $1/z_L$. We thus determine the scattering amplitude to be 

$$\mathcal{M}(0) = -\sum_{L,h} \frac{\mathcal{R}_L}{z_L} = \sum_{L,h} \frac{\mathcal{M}_L(z_L)\,\mathcal{M}_R(z_L)}{P_L(0)^2} \qquad (3)$$


Note that the sum also runs over the helicity h carried by the intermediate particle P L . 
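The contour argument can be tried on a toy function. The sketch below is not a scattering amplitude: it is a made-up rational function with three simple poles (the lists `poles` and `residues` are arbitrary) that vanishes at infinity, which is the only property the argument uses. It checks that the integral of $(dz/z)\,\mathcal{M}(z)$ over a large circle vanishes and that $\mathcal{M}(0)$ equals minus the sum of residues divided by pole positions.

```python
import cmath
import math

# Hypothetical pole positions z_L and residues R_L of a toy "amplitude".
poles = [1.5 + 0.5j, -0.8 + 1.2j, 0.3 - 2.0j]
residues = [2.0 - 1.0j, 0.5 + 0.7j, -1.3 + 0.2j]

def M(z):
    """Toy function with simple poles only, vanishing as |z| -> infinity."""
    return sum(R / (z - zL) for R, zL in zip(residues, poles))

def contour_average(radius, steps=512):
    """(1 / 2 pi i) times the closed integral of dz/z M(z) over |z| = radius.

    On the circle dz/z = i d(theta), so the integral reduces to the
    average of M over the circle.
    """
    total = 0
    for k in range(steps):
        z = radius * cmath.exp(2j * math.pi * k / steps)
        total += M(z)
    return total / steps

lhs = M(0)
rhs = -sum(R / zL for R, zL in zip(residues, poles))
print("contour integral on a large circle:", abs(contour_average(50.0)))
print("M(0) and -sum(R_L / z_L):", lhs, rhs)
```

Adding a constant to `M` spoils the first check, which is exactly the failure mode of scalar field theory discussed below.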

The notation has been a bit compact, but suffices to get the essential point across 
without cluttering the page with bloated expressions. But let us now make the notation 
a bit more precise. To start with, $z_L$ of course depends on the specific partition $L$ through 
the momentum $P_L$. To make sure you follow, let us describe $\mathcal{M}_L(z_L)$ more explicitly. It is 
an on-shell amplitude with $(n_L + 1)$ particles coming in, respectively carrying momentum 
and helicity $(p_r(z_L), h_r)$, $(p_i, h_i)$ for $i \in L$, $i \neq r$, and $(P_L(z_L), h)$. Two of the momenta are 
complex, namely $p_r(z_L)$ and $P_L(z_L)$. Let us emphasize that by construction $P_L(z_L)^2 = 0$ 
and so all particles are on shell. Similarly, $\mathcal{M}_R(z_L)$ is an on-shell amplitude with $(n_R + 1)$ 
particles coming in, respectively carrying momentum and helicity $(p_s(z_L), h_s)$, $(p_i, h_i)$ for 
$i \in R$, $i \neq s$, and $(-P_L(z_L), -h)$. 

The crucial point is that, amazingly, as was discovered by Britto, Cachazo, Feng, and 
Witten, we can determine the $n$-point tree amplitude $\mathcal{M}(z)$ in terms of lower point on-shell 
tree amplitudes, specifically (3) as a sum over products of the $(n_L + 1)$-point amplitude 
$\mathcal{M}_L(z_L)$ and $(n_R + 1)$-point amplitude $\mathcal{M}_R(z_L)$. Note that $n - 1 \geq n_L + 1 \geq 3$ (similarly 
for $n_R + 1$), and thus by applying these so-called BCFW recursion relations repeatedly, we 
can calculate any on-shell tree amplitude in Yang-Mills theory and in gravity in terms of an 
irreducible 3-point amplitude. Furthermore, in the primitive 3-point on-shell amplitude, all 
Lorentz invariants constructed out of the momenta vanish, since $p_i \cdot p_j = \frac{1}{2}(p_i + p_j)^2 = 0$. 

To determine the loop amplitudes, Bern, Dixon, and Kosower have generalized the 
unitarity methods sketched earlier in this text and alluded to in the preceding chapter. With 
these methods, one can calculate all amplitudes, trees and loops, and thus determine the 
theory completely in terms of the helicity dependence of the 3-point on-shell amplitude. 
Amazingly, the old dream of the S-matrix school comes true! Everything within perturbation 
theory is determined without our ever having to refer to a Lagrangian. 

Note that to obtain physical amplitudes we need only $\mathcal{M}(z = 0)$ but to recurse to higher 
point amplitudes we need to know $\mathcal{M}(z \neq 0)$. As we will see, once we have $\mathcal{M}(z = 0)$ we 
can obtain $\mathcal{M}(z \neq 0)$ by analytic continuation. 

As emphasized in the preceding chapter, the decomposition of a lightlike vector in terms 
of spinors 

$$p_{\alpha\dot{\alpha}} = p_\mu \sigma^\mu_{\alpha\dot{\alpha}} = \lambda_\alpha \tilde{\lambda}_{\dot{\alpha}} \qquad (4)$$

works equally well for complex lightlike vectors. In that case, as already explained in 
chapter N.2, the two spinors $\lambda$ and $\tilde{\lambda}$ are independent of each other. 

Another side remark: The deformation (1) considered here has a nice form in the 
helicity spinor formalism of the preceding chapter. Let $p_r = \lambda_r \tilde{\lambda}_r$ and $p_s = \lambda_s \tilde{\lambda}_s$ (with 
spinor indices suppressed). Then the spinor deformation $\tilde{\lambda}_r \to \tilde{\lambda}_r + z\tilde{\lambda}_s$ and $\lambda_s \to \lambda_s - z\lambda_r$ 
(leaving $\lambda_r$ and $\tilde{\lambda}_s$ unchanged) gives the desired momentum deformation with $q = \lambda_r \tilde{\lambda}_s$, 
which we see is not hermitean and hence corresponds to a complex momentum. This is 
consistent with the discussion in the preceding chapter, since the deformation obviously 
does not respect the equality between $\tilde{\lambda}$ and $\lambda^*$ necessary for real momenta. 
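The spinor form of the shift is easy to test numerically. In the sketch below each momentum is stored as the $2 \times 2$ matrix $\lambda_\alpha \tilde{\lambda}_{\dot\alpha}$, for which being lightlike means having zero determinant. The spinor components and the value of `z` are arbitrary numbers, and the helper names (`outer`, `det2`, `shift`) are hypothetical.

```python
def outer(lam, lam_tilde):
    """Momentum bispinor p = lambda lambda-tilde as a 2x2 matrix."""
    return [[lam[a] * lam_tilde[b] for b in range(2)] for a in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Arbitrary complex spinors for particles r and s (illustrative values only).
l_r, lt_r = (0.4 + 1.0j, -0.7 + 0.2j), (1.1 - 0.3j, 0.5 + 0.9j)
l_s, lt_s = (-0.2 + 0.6j, 0.8 - 0.5j), (0.3 + 0.3j, -1.2 + 0.1j)

q = outer(l_r, lt_s)  # q = lambda_r lambda-tilde_s

def shift(z):
    """Shift lambda-tilde_r and lambda_s, leaving lambda_r and lambda-tilde_s alone."""
    lt_r_z = tuple(a + z * b for a, b in zip(lt_r, lt_s))
    l_s_z = tuple(a - z * b for a, b in zip(l_s, l_r))
    return outer(l_r, lt_r_z), outer(l_s_z, lt_s)

z = 1.3 - 0.8j
pr_z, ps_z = shift(z)
p_r, p_s = outer(l_r, lt_r), outer(l_s, lt_s)

print("det p_r(z), det p_s(z):", abs(det2(pr_z)), abs(det2(ps_z)))
print("max |p_r(z) - p_r - z q|:",
      max(abs(pr_z[a][b] - p_r[a][b] - z * q[a][b]) for a in range(2) for b in range(2)))
```

Both shifted matrices remain of rank one, so the momenta stay on shell for every `z`, and the sum `pr_z + ps_z` is unchanged.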


The naive person about to recurse 

Imagine that you woke up one morning and had the wonderful idea of complexifying 
momentum. Then suppose you had enough wits, after a bow to Cauchy, to discover these 
marvellous recursion relations. But after you calmed down, you wanted to try the recursion 
out on some theory. Naturally, you first chose a scalar field theory, say a $\varphi^3$ or a $\varphi^4$ theory. 

Your enthusiasm dies immediately. In these theories, the basic vertex is just a number, 
the coupling. For an $n$-point amplitude, there are always some Feynman diagrams in which 
$p_r(z)$ and $p_s(z)$ meet at one of the basic vertices, and the entire diagram does not even 
depend on $z$. The crucial assumption that $\mathcal{M}(z) \to 0$ as $z \to \infty$ is simply not true. 

Most physicists might give up at this point, but suppose you were possessed of strength 
of character and decided to take a look at Yang-Mills theory, thinking that, after all, it seemed 
much more fundamental than some dumb scalar field theory. But a quick look convinces 
you that things are even worse. Consider the diagram in figure N.3.2a contributing to the 
$n$-gluon amplitude. Put $p_r(z)$ and $p_s(z)$ as “far apart” as possible to maximize the number of 
propagators between them. There are $(n - 3)$ propagators, contributing a factor of $1/z^{n-3}$ 
to the amplitude as $z \to \infty$. But alas, this is overwhelmed by $(n - 2)$ cubic vertices with 
each vertex linear in momentum, thus contributing a factor of $z^{n-2}$. 

Figure N.3.2 

You have not yet included the polarization vectors, which for $z = 0$ are given by $\epsilon_r^- = q$, 
$\epsilon_r^+ = q^*$ and $\epsilon_s^- = q^*$, $\epsilon_s^+ = q$ (note that $q \leftrightarrow q^*$ under $r \leftrightarrow s$, since the two momenta 
point in opposite directions). We also have to deform them to maintain their orthogonality 
with the corresponding momentum vectors: 

$$\epsilon_r^-(z) = q, \quad \epsilon_r^+(z) = q^* + z p_s \qquad (5)$$

and 

$$\epsilon_s^-(z) = q^* - z p_r, \quad \epsilon_s^+(z) = q \qquad (6)$$

You should check that all conditions are satisfied, for example, $\epsilon_r^+(z) \cdot p_r(z) = (q^* + 
z p_s) \cdot (p_r + zq) = z(q^* \cdot q + p_s \cdot p_r) = 0$. [In the notation of (N.2.9,10), the polarization vectors 
here correspond to the choice $\mu_r(z) = \mu_r(0) = \lambda_s$, $\tilde{\mu}_r(z) = \tilde{\mu}_r(0) = \tilde{\lambda}_s$, $\mu_s(z) = \mu_s(0) = 
\lambda_r$, $\tilde{\mu}_s(z) = \tilde{\mu}_s(0) = \tilde{\lambda}_r$. The first equal sign in each relation simply emphasizes that we 
choose not to deform the $\mu$'s and $\tilde{\mu}$'s.] 
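The check suggested in the text can be delegated to a few lines of code. This sketch works in the frame $p_r = (1,0,0,1)$, $p_s = (1,0,0,-1)$, $q = (0,1,i,0)$ and tests that each deformed polarization vector, taken as $\epsilon_r^-(z) = q$, $\epsilon_r^+(z) = q^* + zp_s$, $\epsilon_s^-(z) = q^* - zp_r$, $\epsilon_s^+(z) = q$, is orthogonal to its deformed momentum. The value of `z` is arbitrary and the helper names are illustrative.

```python
def mdot(a, b):
    """Minkowski product, signature (+, -, -, -), without complex conjugation."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def add(a, b, factor=1):
    """Return a + factor * b, componentwise."""
    return tuple(x + factor * y for x, y in zip(a, b))

p_r = (1, 0, 0, 1)
p_s = (1, 0, 0, -1)
q = (0, 1, 1j, 0)
q_star = (0, 1, -1j, 0)

z = -0.6 + 1.9j  # arbitrary complex deformation parameter
pr_z = add(p_r, q, z)    # p_r(z) = p_r + z q
ps_z = add(p_s, q, -z)   # p_s(z) = p_s - z q

eps = {
    ("r", "-"): q,
    ("r", "+"): add(q_star, p_s, z),    # q* + z p_s
    ("s", "-"): add(q_star, p_r, -z),   # q* - z p_r
    ("s", "+"): q,
}

for (leg, hel), vec in eps.items():
    momentum = pr_z if leg == "r" else ps_z
    print(leg, hel, abs(mdot(vec, momentum)))
```

The cancellation in the `("r", "+")` entry is the one spelled out in the text: $q^* \cdot q = -2$ against $p_s \cdot p_r = +2$ in this frame.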

Note the peculiar asymmetry between $r$ and $s$ after deformation: in particular, two of the 
polarization vectors, $\epsilon_r^+$ and $\epsilon_s^-$, grow with $z$ and thus worsen the large $z$ behavior. Putting 
it all together and referring to (5) and (6), you would conclude [with the notation $\mathcal{M}^{h_r h_s}(z)$] 
that 

$$\mathcal{M}^{-+}(z) \xrightarrow{\text{naive}} \frac{z^{n-2}}{z^{n-3}} = z, \quad \mathcal{M}^{--\ \text{or}\ ++}(z) \xrightarrow{\text{naive}} z^2, \quad \mathcal{M}^{+-}(z) \xrightarrow{\text{naive}} z^3 \qquad (7)$$

which most certainly do not $\to 0$. 
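The naive counting amounts to simple bookkeeping: one net power of $z$ from $(n-2)$ cubic vertices against $(n-3)$ propagators, plus one more power for each polarization vector that grows with $z$. A minimal sketch, in which the function name `naive_power` and the string labels for helicities are our own conventions:

```python
def naive_power(h_r, h_s):
    """Naive large-z exponent of M^{h_r h_s}(z), independent of n."""
    power = 1            # z^(n-2) from vertices divided by z^(n-3) from propagators
    if h_r == "+":
        power += 1       # eps_r^+(z) = q* + z p_s grows like z
    if h_s == "-":
        power += 1       # eps_s^-(z) = q* - z p_r grows like z
    return power

for h_r in "-+":
    for h_s in "-+":
        print("M^{%s%s}(z) ~ z^%d (naive)" % (h_r, h_s, naive_power(h_r, h_s)))
```

This only tabulates the naive expectation; it says nothing about the cancellations that improve the true behavior.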

A seemingly unimportant comment that will become important later: Of course, some 
of the gluons other than $r$ and $s$ could first interact among themselves as shown in 
figure N.3.2b. This merely reduces the effective $n$ in the discussion above for those particular 
diagrams, and we reach the same naive estimates. 


Reality more benign than expectation 

Reality turns out to be much more benign than our naive expectation! Actually, amplitudes 
in Yang-Mills theory behave better than amplitudes in scalar field theory, the opposite of 
what we thought. 

This fact is either astonishing or not so astonishing, depending on how jaded you are. I 
have to admit that it sounds a bit less amazing after learning in the preceding chapter that 
~10,000 terms can cancel down to a single term. 

We can even concoct a heuristic physical argument. Go back to Yang-Mills theory and 
call the particles gluons, as before. Apply crossing to gluon $s$, so that we have an incoming 
gluon $r$ with a huge momentum $p_r(z) \sim zq$ in the large $z$ limit, emerging as a gluon with 
the huge momentum $-p_s(z) \sim zq$. The other $(n - 2)$ gluons have fixed momenta and are 
thus soft. We have a hard gluon blasting through a soft gluon background, something like 
a high energy gamma ray blasting through a magnetic field, and thus we do not expect 
much scattering as $z \to \infty$, and even less scattering that would flip the helicity of the hard 
gluon. (The situation is conceptually similar to electron scattering in an external Coulomb 
potential, discussed in chapter II.6, except that here the field excitation being scattered is 
of the same type as the background field.) 

But not so fast! Even though you and I have studied physics for years, we haven’t built 
up much intuition about complex momenta. At least I speak for myself. Alternatively, we 
could go to SO(2, 2) and deal with real momenta, but we haven’t much experience with 
signature $(+ + - -)$ spacetime, either. 


Background field method 

Nevertheless, the picture of a hard gluon blasting through a soft background turns out 
to be helpful in guiding us toward an elegant formulation of the problem. We split 
the Yang-Mills gauge potential (which we write as $\mathcal{A}$ on this occasion) into two pieces, 
$\mathcal{A}(x) = A(x) + a(x)$, a background potential $A$ without high momentum components in 
its Fourier transform and a fluctuating potential $a$ with high momentum components. (You 
would do exactly the same split when studying a laser beam passing through a laboratory 
magnetic field.) To develop this so-called background field method (which is useful for 
other problems besides this one), it pays to use the differential form notation used in 
chapter IV.5. 

We split the transformation law $\mathcal{A} \to U\mathcal{A}U^\dagger + U dU^\dagger$ into $A \to UAU^\dagger + U dU^\dagger$ and 
$a \to U a U^\dagger$. In other words, the background $A$ transforms like a Yang-Mills potential, while 
the fluctuating $a$ transforms like a matter field in the adjoint representation. Plugging into 
the field strength $\mathcal{F} = d\mathcal{A} + \mathcal{A}^2 = d(A + a) + (A + a)^2$, we find $\mathcal{F}$ equal to the sum of the 
background field strength $F = dA + A^2$ and the 2-form $da + Aa + aA + a^2 = (\partial_\mu a_\nu + 
[A_\mu, a_\nu] + a_\mu a_\nu)\,dx^\mu dx^\nu = \frac{1}{2}(D_\mu a_\nu - D_\nu a_\mu + [a_\mu, a_\nu])\,dx^\mu dx^\nu$. Switching back from math 
to physics notation and defining the shorthand notation $D_{[\mu} a_{\nu]} = D_\mu a_\nu - D_\nu a_\mu$ we have 
$\mathcal{F}_{\mu\nu} = F_{\mu\nu} + D_{[\mu} a_{\nu]} - i[a_\mu, a_\nu]$. Here $D_\mu a_\nu = \partial_\mu a_\nu - i[A_\mu, a_\nu]$ is the covariant derivative 
(with respect to the background potential $A$) of the adjoint field $a$. 

Since we have only two hard gluons interacting with the soft background, it suffices to 
expand the Yang-Mills Lagrangian to quadratic order in $a$: 

$$\mathcal{L} = -\frac{1}{2g^2}\,\mathrm{tr}\,\mathcal{F}_{\mu\nu}\mathcal{F}^{\mu\nu} = -\frac{1}{2g^2}\,\mathrm{tr}\left(F_{\mu\nu}F^{\mu\nu} + D_{[\mu} a_{\nu]} D^{[\mu} a^{\nu]} + 2F^{\mu\nu} D_{[\mu} a_{\nu]} - 2iF^{\mu\nu}[a_\mu, a_\nu]\right) + O(a^3) \qquad (8)$$

Since in the action we integrate $\mathcal{L}$ over spacetime, we are effectively allowed to integrate 
by parts. Thus the third term in the parenthesis, $\mathrm{tr}(F^{\mu\nu} D_\mu a_\nu) = \mathrm{tr}(F^{\mu\nu}(\partial_\mu a_\nu - 
i[A_\mu, a_\nu]))$ “=” $\mathrm{tr}((D_\mu F^{\mu\nu}) a_\nu)$. Since the background field satisfies the field equation 
$D_\mu F^{\mu\nu} = 0$, this term vanishes. (You should not be surprised that the term linear in $a$ 
in the action is linear in the field equation.) 

Thus, to study the propagation of $a$ through the background $A$, we can focus on the 
Lagrangian quadratic in $a$: $\mathcal{L}_{\text{quad}} = -(1/g^2)\,\mathrm{tr}\left((D_\mu a_\nu - D_\nu a_\mu) D^\mu a^\nu - iF^{\mu\nu}[a_\mu, a_\nu]\right)$. As 
always, we need to fix the gauge. Upon integration by parts we have 

$$\mathrm{tr}\, D_\nu a_\mu D^\mu a^\nu = \mathrm{tr}\left(D_\mu a^\mu D_\nu a^\nu + iF^{\mu\nu}[a_\mu, a_\nu]\right) \qquad (9)$$

Note that, unlike ordinary derivatives, when the gauge derivatives $D_\nu$ and $D_\mu$ pass each 
other, they produce the field strength $F_{\mu\nu}$. [Verify this! You might recall the more mathematical 
form (IV.5.13).] Thus a convenient way of fixing the gauge is to add $\mathrm{tr}(D_\mu a^\mu D_\nu a^\nu)$ 
so that the gauge-fixed Lagrangian becomes 

$$\mathcal{L}_{\text{quad}} = -\frac{1}{g^2}\,\mathrm{tr}\left(D_\mu a_\nu D^\mu a^\nu - 2iF^{\mu\nu}[a_\mu, a_\nu]\right) \qquad (10)$$

[Incidentally, this parallels precisely what was done in (III.4.8) to obtain the Feynman gauge 
with $\xi = 1$.] 
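For readers who want the intermediate step between (9) and (10) written out, here is one way to organize it. This is a sketch of the bookkeeping only, assuming (as the text indicates) that the gauge-fixing term is added inside the parentheses of the quadratic Lagrangian and that integration by parts is freely allowed under the spacetime integral.

```latex
\begin{align*}
\mathcal{L}_{\rm quad}
 &= -\frac{1}{g^2}\,\mathrm{tr}\Big( D_\mu a_\nu D^\mu a^\nu
      - D_\nu a_\mu D^\mu a^\nu - iF^{\mu\nu}[a_\mu, a_\nu] \Big) \\
 &= -\frac{1}{g^2}\,\mathrm{tr}\Big( D_\mu a_\nu D^\mu a^\nu
      - D_\mu a^\mu D_\nu a^\nu - 2iF^{\mu\nu}[a_\mu, a_\nu] \Big)
      && \text{using (9)} \\
 &\to -\frac{1}{g^2}\,\mathrm{tr}\Big( D_\mu a_\nu D^\mu a^\nu
      - 2iF^{\mu\nu}[a_\mu, a_\nu] \Big)
      && \text{after adding } \mathrm{tr}(D_\mu a^\mu D_\nu a^\nu)
\end{align*}
```

The gauge-fixing term is chosen precisely to cancel the $(D \cdot a)^2$ piece, which is why the coefficient of the $F^{\mu\nu}[a_\mu, a_\nu]$ term doubles on the way to (10).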

Return now to our problem of studying the large-$z$ behavior of the scattering amplitude 
$\mathcal{M}^{\mu\nu}$. Recall from (7) that we obtain $\mathcal{M}^{\mu\nu} \to z^{n-2}/z^{n-3} = z$. (Also recall that polarization 
vectors have not yet been included, and they could multiply this behavior by $z^0$, $z^1$, or $z^2$.) 

The culprit is the derivative in the cubic vertex $\sim A a \partial a$ sitting inside the first term in 
(10). In contrast, the $\sim A a A a$ piece in the first term and the second term $\mathrm{tr}\, F^{\mu\nu}[a_\mu, a_\nu]$ in 
(10) insert quartic vertices that do not grow with $z$. 

The situation confronting us is now best discussed by the clever trick of renaming 
indices. First, understand that Lorentz invariance is broken by the presence of the background 
field $A$ to be regarded as given and fixed. (This is the same as in chapter VI.2: the 
presence of a background magnetic field means that parity and time reversal are broken.) 
But now suppose we simply relabel indices and write 

$$\mathcal{L}_{\text{quad}} = -\frac{1}{g^2}\,\mathrm{tr}\left(\eta^{ab} D_\mu a_a D^\mu a_b - 2iF^{ab}[a_a, a_b]\right) \qquad (11)$$

where $\eta^{ab}$ is nothing but the humble Minkowski metric. 

The first term by itself enjoys a hidden “enhanced Lorentz” symmetry: an SO(3, 1) 
transformation on the indices $a$, $b$ leaves the Lagrangian invariant. We now exploit this 
hidden symmetry. Since the leading behavior of $\mathcal{M}^{ab}$ for large $z$ comes from repeated 
insertion of the cubic vertex $\eta^{ab} a_a A_\mu \partial^\mu a_b$ contained in the first term in (11), we conclude 
that the leading behavior must be proportional to $\eta^{ab}$. 

In contrast, with one insertion of the quartic vertex from the second term in (11), we 
decrease the power of $z$ by one, since it does not contain a derivative on the field $a$. But 
we also break the hidden “enhanced Lorentz” symmetry, since $F^{ab}$ is fixed. On the other 
hand, there is an extra bit of information: we know that it is antisymmetric in $(ab)$. [Note 
that an insertion of the quartic vertex $\eta^{ab} a_a A_\mu A^\mu a_b$ contained in the first term in (11) also 
decreases the power of $z$ by one, but its contribution is proportional to $\eta^{ab}$.] 

Thus the hidden “enhanced Lorentz” symmetry tells us that the amplitude expanded in 
powers of $z$ must have the form 

$$\mathcal{M}^{ab} = (cz + \cdots)\,\eta^{ab} + A^{ab} + \frac{1}{z} B^{ab} + \cdots \qquad (12)$$

with $c$ some unknown constant. The only thing we know about the matrix $A^{ab}$ is that it is 
antisymmetric in $(ab)$. (I am following the notation in the literature. If you are confused 
between this matrix $A$ and the background gauge potential $A(x)$ you need to go back to 
square 1.) 

We still have gauge invariance in the form $p_{ra}(z)\mathcal{M}^{ab}(z)\epsilon_{sb}(z) = 0$ and 
$\epsilon_{ra}(z)\mathcal{M}^{ab}(z)p_{sb}(z) = 0$, giving us valuable information. For example, looking up the form 
$p_r(z) = p_r + zq$, we obtain $q_a \mathcal{M}^{ab}(z)\epsilon_{sb}(z) = -(1/z)\,p_{ra}\mathcal{M}^{ab}(z)\epsilon_{sb}(z)$, but since from (5) 
$\epsilon_r^-(z) = q$, this means that $\epsilon_{ra}^-(z)\mathcal{M}^{ab}(z)\epsilon_{sb}(z) = -(1/z)\,p_{ra}\mathcal{M}^{ab}(z)\epsilon_{sb}(z)$. 

Let us now look at the specific helicity combinations for which we had naive expectations 
in (7). Recall that we expected $\mathcal{M}^{-+}(z) \to z$. In fact, since $\epsilon_s^+(z) = q$ and $p_r \cdot q = 0$, we have 

$$\mathcal{M}^{-+}(z) = \epsilon_{ra}^-(z)\mathcal{M}^{ab}(z)\epsilon_{sb}^+(z) = -\frac{1}{z}\,p_{ra}\left\{(cz + \cdots)\,\eta^{ab} + A^{ab} + \frac{1}{z}B^{ab} + \cdots\right\} q_b = -\frac{1}{z}\,p_{ra}A^{ab}q_b + O\!\left(\frac{1}{z^2}\right) \to \frac{1}{z} \qquad (13)$$

This amplitude behaves better than naive expectation by two powers of $(1/z)$! 

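Next, $\mathcal{M}^{--}(z) \to z^2$ naively, but in fact 

$$\mathcal{M}^{--}(z) = \epsilon_{ra}^-(z)\mathcal{M}^{ab}(z)\epsilon_{sb}^-(z) = -\frac{1}{z}\,p_{ra}\left\{(cz + \cdots)\,\eta^{ab} + A^{ab} + \frac{1}{z}B^{ab} + \cdots\right\}(q^*_b - z p_{rb}) = -\frac{1}{z}\left(p_{ra}A^{ab}q^*_b - p_{ra}B^{ab}p_{rb}\right) + O\!\left(\frac{1}{z^2}\right) \to \frac{1}{z} \qquad (14)$$

three powers better than naive expectation. (The $\eta^{ab}$ term drops out because $p_r \cdot q^* = 0$ and 
$p_r^2 = 0$, and the term $z\,p_{ra}A^{ab}p_{rb}$ vanishes by the antisymmetry of $A^{ab}$.) Similarly, 
$\mathcal{M}^{++}(z) \to 1/z$. Note that these conclusions hold for any $n$. If you have the strength, you 
might want to witness the cancellations by explicitly calculating the various $\mathcal{M}$'s for low 
values of $n$. 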

But not all helicity amplitudes behave better than naive expectation. We finally come 
to $\mathcal{M}^{+-}(z)$, which $\to z^3$ naively. Looking at (5) and (6) we already see trouble, since both 
$\epsilon_r^+(z)$ and $\epsilon_s^-(z)$ grow like $z$. Now we have 

$$\mathcal{M}^{+-}(z) = \epsilon_{ra}^+(z)\mathcal{M}^{ab}(z)\epsilon_{sb}^-(z) = (q^*_a + z p_{sa})\left\{(cz + \cdots)\,\eta^{ab} + A^{ab} + \frac{1}{z}B^{ab} + \cdots\right\}(q^*_b - z p_{rb}) = -c\,p_s \cdot p_r\,z^3 + O(z^2) \to z^3 \qquad (15)$$

Incidentally, note that our intuition about complex momenta is a bit shaky. The 
helicity-conserving amplitude $(+ \to +)$ [namely $\mathcal{M}^{+-}(z)$ by crossing; recall that $\mathcal{M}$ was defined 
with all momenta going in] behaves worse than the $(+ \to -)$ amplitude $\mathcal{M}^{++}(z)$, the 
$(- \to +)$ amplitude $\mathcal{M}^{--}(z)$, and the $(- \to -)$ amplitude $\mathcal{M}^{-+}(z)$. The polarization 
vectors are continued for complex momentum in a nonsymmetric fashion. 
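The power counting can be illustrated with a toy matrix of the form $(cz)\,\eta^{ab} + A^{ab} + B^{ab}/z$ contracted with the deformed polarization vectors in the frame $p_r = (1,0,0,1)$, $p_s = (1,0,0,-1)$, $q = (0,1,i,0)$. Everything numerical here is invented: `c`, the antisymmetric `A`, and `B` are arbitrary, and this toy matrix does not satisfy the gauge-invariance conditions of a real amplitude, so the sketch only checks the leading $-c\,p_s \cdot p_r\,z^3$ growth of the $(+-)$ contraction and the $1/z$ falloff of the direct $(-+)$ contraction.

```python
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def lower(v):
    """Lower a Lorentz index with the Minkowski metric."""
    return (v[0], -v[1], -v[2], -v[3])

def contract(u, m, v):
    """Compute u_a m^{ab} v_b for lower-index vectors u and v."""
    return sum(u[a] * m[a][b] * v[b] for a in range(4) for b in range(4))

p_r = (1, 0, 0, 1)
p_s = (1, 0, 0, -1)
q = (0, 1, 1j, 0)
q_star = (0, 1, -1j, 0)

c = 0.5                                   # hypothetical constant
A = [[0, 1, 2, 3], [-1, 0, 4, 5],         # arbitrary antisymmetric matrix
     [-2, -4, 0, 6], [-3, -5, -6, 0]]
B = [[1, 2, 0, 1], [0, 3, 1, 2],          # arbitrary matrix
     [2, 1, 1, 0], [1, 0, 2, 4]]

def M(z):
    return [[c * z * ETA[a][b] + A[a][b] + B[a][b] / z for b in range(4)]
            for a in range(4)]

def amp_plus_minus(z):
    eps_r = tuple(a + z * b for a, b in zip(q_star, p_s))   # q* + z p_s
    eps_s = tuple(a - z * b for a, b in zip(q_star, p_r))   # q* - z p_r
    return contract(lower(eps_r), M(z), lower(eps_s))

def amp_minus_plus(z):
    return contract(lower(q), M(z), lower(q))               # both vectors equal q

for z in (1e1, 1e2, 1e3):
    print(z, amp_plus_minus(z) / z**3, z * amp_minus_plus(z))
```

As `z` grows, the second column approaches $-c\,p_s \cdot p_r = -1$ for these inputs, while the third column stays fixed at $q_a B^{ab} q_b$, since $q^2 = 0$ kills the $\eta^{ab}$ term and antisymmetry kills the $A^{ab}$ term.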

Confusio suddenly speaks up! “You haven’t yet exploited the gauge invariance of the 
background field,” he says. 

We forgot that he often appears in the company of SE. Indeed, he is right. Very good— 
Confusio did not become an assistant professor for nothing. 

Indeed, let us look at the cubic vertex in figure N.3.2a more carefully: we have a hard 
gluon carrying momentum $zq + \cdots$ scattering off a background gluon carrying some 
small momentum $p$ into a hard gluon with momentum $zq + \cdots$. The coupling comes 
from the term $\mathrm{tr}\,\partial_\mu a_\nu [A^\mu, a^\nu]$ in the Lagrangian, and thus to leading order in $z$ the vertex is 
proportional to $z\,q^\mu A_\mu(p)$. According to exercise VII.1.1, we can choose a gauge in which 
$q^\mu A_\mu(p) = A_{1+i2}(p) = 0$, known as the Chalmers-Siegel space cone gauge. 

We should check to see if this is possible, but to streamline the exposition let us 
merely do the abelian case. With $A_\mu(x) \to A_\mu(x) - \partial_\mu \Lambda(x)$, the desired gauge choice 
requires $q \cdot A(p) = i\,q \cdot p\,\Lambda(p)$, and thus we can solve for $\Lambda(p)$ as long as $q \cdot p \neq 0$. While 
$q \cdot p_{r,s} = 0$ by construction, generically there is no reason for $q \cdot p_i$ to vanish for $i \neq r, s$. 
So we conclude that indeed we can get rid of the offending cubic vertex. 

But not so fast! What about figure N.3.2c, in which all the soft gluons interact with 
each other to form one single soft gluon carrying momentum $\sum_{i \neq r,s} p_i = -(p_r + p_s)$? 

Since $q \cdot (p_r + p_s) = 0$, we cannot set $q \cdot A(p_r + p_s) = 0$ and the diagram in figure N.3.2c 
remains. Thus, even though we managed to get rid of the cubic vertices in figure N.3.2a, b, 
our previous conclusion about the large-$z$ behavior of $\mathcal{M}$ still stands. 

“Wait! What about the color factor?” Confusio yells. Let us look at the color structure 
we stripped off. From figure IV.5.2b we see that the cubic vertex in figure N.3.2c requires 
that the two hard gluons be adjacent in color. It is easiest to explain the terminology by 
an example: a red-green gluon and a blue-yellow gluon are not adjacent in color, but they 
are both adjacent to a red-yellow gluon (and to a blue-green gluon). Note that the coupling 
$\mathrm{tr}\,F^{\mu\nu}[a_\mu, a_\nu]$ also requires that the two hard gluons be color adjacent. 

Thus, if the two hard gluons are not color adjacent, the large-$z$ behavior of $\mathcal{M}$ is 
somewhat better, since now $c = 0$ and $A^{ab} = O(1/z)$. Then $\mathcal{M}^{-+} \to 1/z^2$ instead of $1/z$, 
$\mathcal{M}^{+-} \to z^2$ instead of $z^3$, while $\mathcal{M}^{--}$ and $\mathcal{M}^{++}$ are not improved. Confusio deserves credit 
for his partial triumph, and perhaps eventually should be given tenure. 

The bottom line is that, contrary to naive expectation, amplitudes in gauge theory behave 
well enough for the BCFW recursion program to work. We don’t even mind that $\mathcal{M}^{+-}$ 
behaves badly; it suffices for the program that $\mathcal{M}^{-,\,\text{any helicity}}$ vanishes for large $z$. In 
particular, in appendix 1 we will show how to complete the calculation started in the 
preceding chapter. 

As indicated earlier, once we determine the tree amplitudes, we can in principle obtain 
all loop amplitudes by using unitarity. In this modern revival of the S-matrix spirit, we deal 
with only on-shell amplitudes. The message here is that traditional Feynman diagrams 
carry around an enormous amount of unnecessary off-shell baggage. A dramatic example 
is furnished by this innocuous looking Feynman integral 

$$\int \frac{d^4\ell}{(2\pi)^4}\,\frac{\ell^\mu \ell^\nu \ell^\rho \ell^\lambda}{\ell^2\,(\ell - k)^2\,(\ell - p)^2\,(\ell - q)^2} \qquad (16)$$

which you can evaluate most conveniently using dimensional regularization. Try it. The integral 
looks similar to the integrals we did back in chapters III.6,7, but looks are deceptive. 
The answer, if printed on a page, is a total black smudge (see 
http://online.kitp.ucsb.edu/online/colloq/bern2/oh/05.html). After all, this integral is just one 
piece of a physical amplitude and by itself does not possess any nice qualities, such as gauge invariance. 


All possible Lorentz invariant theories 

Remarkably, not only does BCFW recursion allow us to determine all $n$-point on-shell 
amplitudes in terms of a primitive 3-point on-shell amplitude, it also restricts all possible 
theories for which the recursion works. Let us sketch how this is possible. We anticipate 
here, as we will explain in the next chapter, that the recursion program works for massless 
spin 2 as well as for spin 1 particles. Consider a 4-point on-shell amplitude $\mathcal{M}$. The point is 
that we are free to deform different pairs $(r, s)$ to determine $\mathcal{M}$. Suppose we pick $(r, s) = 
(1, 4)$. Then $\mathcal{M}$ is the sum of two pieces, one with a pole in $s = (p_1 + p_2)^2 = (p_3 + p_4)^2$ and 
another with a pole in $t = (p_1 + p_3)^2 = (p_2 + p_4)^2$. But we could have also picked $(1, 2)$ for 
example. That the physical 4-point on-shell amplitude $\mathcal{M}(z = 0)$ constructed in different 
ways must agree imposes powerful self-consistency conditions on the primitive 3-point 
on-shell amplitude. 

Perhaps not surprisingly, for spin 2 massless particles, Einstein gravity is the only 
possible theory, while for spin 1 massless particles, Yang-Mills gauge theory. Indeed, this 
result was proven long ago by Weinberg using rather general arguments. But it is still 
instructive to see how the same result emerges from a strikingly different formalism. 
These self-consistency conditions also allow one to explore and search for other possible 
theories. 

It is crucial that the primitive 3-point on-shell amplitude $\mathcal{M}_3$ is evaluated for complex 
momenta, which allow more freedom than garden-variety everyday real lightlike momenta. 
(As already noted in appendix 1 to chapter N.2, the Yang-Mills cubic vertex vanishes 
for real lightlike momenta.) I remind you again that for complex momenta $p_i = \lambda_i \tilde{\lambda}_i$, 
the two spinors $\lambda_i$ and $\tilde{\lambda}_i$ are independent of each other. Recall from chapter N.2 that 
$\langle ij \rangle = \langle \lambda_i \lambda_j \rangle = \varepsilon^{\alpha\beta}\lambda_{i\alpha}\lambda_{j\beta}$ and $[ij] = [\tilde{\lambda}_i \tilde{\lambda}_j] = \varepsilon^{\dot{\alpha}\dot{\beta}}\tilde{\lambda}_{i\dot{\alpha}}\tilde{\lambda}_{j\dot{\beta}}$. Also, $p_i \cdot p_j = \langle ij \rangle [ij]$. 

The on-mass shell conditions $p_i \cdot p_j = 0$ then become $\langle 12 \rangle [12] = 0$, $\langle 23 \rangle [23] = 0$, and 
$\langle 31 \rangle [31] = 0$. Apparently there are several possible solutions. For example, we could have 
all three square brackets vanish with all three angled brackets nonzero, or we could have 
two square brackets vanish, say $[12] = [23] = 0$, with $\langle 31 \rangle = 0$. But there are only two 
independent 2-component spinors, so three spinors cannot be linearly independent [take 
$\tilde{\lambda}_1 \propto (0, 1)$ and $\tilde{\lambda}_2 \propto (1, w)$, then the third spinor $\tilde{\lambda}_3$ is necessarily a linear combination 
of the other two]. Thus, $[12] = 0$ and $[23] = 0$ mean that $\tilde{\lambda}_1 \propto \tilde{\lambda}_2$ and $\tilde{\lambda}_2 \propto \tilde{\lambda}_3$, respectively, 
which implies that $\tilde{\lambda}_3 \propto \tilde{\lambda}_1$ and $[31] = 0$. Of course, the discussion can be repeated with 
square and angled brackets interchanged. Thus we conclude that 

$$\text{either } \langle 12 \rangle = \langle 23 \rangle = \langle 31 \rangle = 0 \text{ or } [12] = [23] = [31] = 0 \qquad (17)$$

(For example, if $[12] = [23] = [31] = 0$, then $\tilde{\lambda}_2 = a_2 \tilde{\lambda}_1$ and $\tilde{\lambda}_3 = a_3 \tilde{\lambda}_1$, and momentum 
conservation $\sum_i p_i = \sum_i \lambda_i \tilde{\lambda}_i = 0$ implies $\lambda_1 + a_2 \lambda_2 + a_3 \lambda_3 = 0$. The information here 
is in the coefficients, since three 2-component spinors are always linearly dependent.) 

Thus, depending on the helicities, either $\mathcal{M}_3 = \mathcal{M}_H(\langle 12 \rangle, \langle 23 \rangle, \langle 31 \rangle)$ or $\mathcal{M}_3 = 
\mathcal{M}_A([12], [23], [31])$. 
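A concrete set of complex 3-point kinematics helps make this less abstract. The sketch below builds the branch in which all square brackets vanish: it picks arbitrary numbers for $\tilde{\lambda}_1$, $a_2$, $a_3$, $\lambda_1$, $\lambda_2$ (all hypothetical values), solves $\lambda_1 + a_2\lambda_2 + a_3\lambda_3 = 0$ for $\lambda_3$, and then checks momentum conservation and the brackets. The helper names `bracket` and `outer` are ours.

```python
def bracket(a, b):
    """Antisymmetric product a_1 b_2 - a_2 b_1, used for both <ij> and [ij]."""
    return a[0] * b[1] - a[1] * b[0]

def outer(lam, lam_tilde):
    """Momentum bispinor p = lambda lambda-tilde as a 2x2 matrix."""
    return [[lam[a] * lam_tilde[b] for b in range(2)] for a in range(2)]

# Arbitrary inputs (illustrative only).
lt1 = (0.3 + 0.1j, -1.2 + 0.5j)
a2, a3 = 0.7 - 0.2j, -1.1 + 0.4j
l1 = (1.0 + 0.5j, 0.2 - 0.3j)
l2 = (-0.4 + 1.0j, 0.9 + 0.1j)

# All tilded spinors proportional to lt1, so every square bracket vanishes.
lt2 = tuple(a2 * x for x in lt1)
lt3 = tuple(a3 * x for x in lt1)
# Momentum conservation fixes the third untilded spinor.
l3 = tuple(-(x + a2 * y) / a3 for x, y in zip(l1, l2))

lams, lts = [l1, l2, l3], [lt1, lt2, lt3]
total = [[sum(outer(l, lt)[a][b] for l, lt in zip(lams, lts)) for b in range(2)]
         for a in range(2)]
squares = [bracket(lt1, lt2), bracket(lt2, lt3), bracket(lt3, lt1)]
angles = [bracket(l1, l2), bracket(l2, l3), bracket(l3, l1)]

print("sum of momenta:", total)
print("[12], [23], [31]:", squares)
print("<12>, <23>, <31>:", angles)
```

No real momenta can do this: for real momenta $\tilde{\lambda}$ is tied to $\lambda^*$, so vanishing square brackets would force the angled brackets to vanish as well.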

Recall from the preceding chapter that $\Lambda_i = -2h_i$, where $\Lambda_i$ counts the powers of $\lambda_i$ 
minus the powers of $\tilde{\lambda}_i$. But acting on $\mathcal{M}_H$, this just counts the powers of $\lambda_i$. Write 
$\mathcal{M}_H = \langle 12 \rangle^{d_3} \langle 23 \rangle^{d_1} \langle 31 \rangle^{d_2}$ and solve for the unknown $d$'s using $\Lambda_1 = d_2 + d_3 = -2h_1$, etc. 
Then $d_1 = h_1 - h_2 - h_3$, $d_2 = h_2 - h_3 - h_1$, and $d_3 = h_3 - h_1 - h_2$. For example, suppose 
the theory contains spin 1 massless particles, with different varieties labeled by an index 
$a$ whose range we need not specify. Then we have, for example, 

$$\mathcal{M}_3(1_a^-, 2_b^-, 3_c^+) = f_{abc}\,\frac{\langle 12 \rangle^3}{\langle 23 \rangle \langle 31 \rangle} \qquad (18)$$

since the helicities $h_1 = h_2 = -1$ and $h_3 = +1$ imply that $d_1 = d_2 = -1$ and $d_3 = 3$. At this 
stage $f_{abc}$ is some unknown coefficient that depends on the particle variety. As required, 
we have two positive powers of $\lambda_1$ and $\lambda_2$ and two negative powers of $\lambda_3$. This confirms 
what we obtained in appendix 1 to the preceding chapter. 
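The solution for the exponents is a three-line computation, sketched here with a hypothetical helper `exponents`. It also checks the scaling conditions it came from, namely that the total power of each $\lambda_i$ in $\langle 12 \rangle^{d_3} \langle 23 \rangle^{d_1} \langle 31 \rangle^{d_2}$ equals $-2h_i$.

```python
def exponents(h1, h2, h3):
    """Exponents (d1, d2, d3) of <23>, <31>, <12> for helicities (h1, h2, h3)."""
    d1 = h1 - h2 - h3
    d2 = h2 - h3 - h1
    d3 = h3 - h1 - h2
    return d1, d2, d3

def lambda_powers(d1, d2, d3):
    """Total powers of lambda_1, lambda_2, lambda_3 in <12>^d3 <23>^d1 <31>^d2."""
    return (d2 + d3, d3 + d1, d1 + d2)

for helicities in [(-1, -1, +1), (-2, -2, +2)]:
    d = exponents(*helicities)
    print(helicities, "->", d, "powers of lambda:", lambda_powers(*d))
```

The second input is the spin 2 case: every exponent of the spin 1 result is doubled, which is the "raise to the power $s$" rule used in the remarks that follow.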

Several remarks follow. 


1. From $p_i = \lambda_i \tilde{\lambda}_i$ the spinors $\lambda$ and $\tilde{\lambda}$ have mass dimension $\frac{1}{2}$. Thus $\mathcal{M}_3$ has mass dimension 1, 
as expected (recall that the cubic coupling in gauge theory has the form $\sim \epsilon \cdot \epsilon\,\epsilon \cdot p$). 

2. We obtain the 3-point amplitude 

$$\mathcal{M}_3(1_a^+, 2_b^+, 3_c^-) = f_{abc}\,\frac{[12]^3}{[23][31]} \qquad (19)$$

by flipping helicities, which, as we have learned in the preceding chapter, amounts to 
interchanging the roles played by $\lambda$ and $\tilde{\lambda}$, so that it is given by square instead of angled 
brackets. 

3. Note the power of the spinor helicity formalism. We can immediately generalize to higher 
integer spin $s$ by scaling the $\Lambda_i$'s and hence the $d$'s up by a factor of $s$. Thus we simply raise 
the round parenthesis in $\mathcal{M}_3(1_a^-, 2_b^-, 3_c^+)$ to power $s$. The cubic vertex for spin 2 is thus 
given by $\mathcal{M}_3(1_a^{--}, 2_b^{--}, 3_c^{++}) = f_{abc}\left(\langle 12 \rangle^3 / (\langle 23 \rangle \langle 31 \rangle)\right)^2$. 

4. Interchanging 1 and 2, we see that $f_{abc} = -f_{bac}$ for $s$ odd. Thus for $s$ odd ($s = 1$, for example), 
we cannot have a theory with only one variety of particles. We are compelled to introduce 
the index $a$ (and call it color!). 

5. For $s$ even ($s = 2$, for example) we can get away with only one variety. Call it the graviton. The 
coefficient $f_{abc}$ can be omitted and one of the two basic cubic vertices for Einstein gravity 
is simply given by 

$$\mathcal{M}_3(1^{--}, 2^{--}, 3^{++}) = \left(\frac{\langle 12 \rangle^3}{\langle 23 \rangle \langle 31 \rangle}\right)^2 \qquad (20)$$

(The other vertex is of course obtained by replacing angled brackets by square brackets.) 
More on the 3-graviton vertex in appendix 2. 

6. Check out the power of the self-consistency argument sketched above. Consider the 4-point 
amplitude $\mathcal{M}(1_a, 2_b, 3_c, 4_d)$ in a theory with a variety of spin 1 massless particles. Apply 
the recursion to construct $\mathcal{M}$ as the sum of an amplitude with an $s$ channel pole, evidently 
proportional to $f_{abe}f_{cde}$ with an implicit sum over the label $e$ of the intermediate particle, 
and an amplitude with a $t$ channel pole proportional to $f_{ace}f_{bde}$. Requiring $\mathcal{M}$ constructed 
with different choices of $(r, s)$ in the recursion to be the same then gives the constraint 

$$f_{abe}f_{cde} + f_{ace}f_{dbe} + f_{ade}f_{bce} = 0 \qquad (21)$$

(with the index order in the middle term chosen so that all three signs are positive for a 
totally antisymmetric $f_{abc}$). But we recognize this as just the defining relation (B.19) for the 
generators of a Lie algebra $[T^a, T^b] = if^{abc}T^c$ written out in the adjoint representation! The 
coefficients $f_{abc}$ that appear in the primitive 3-point on-shell amplitude are the structure 
constants of the algebra. If this is too abstract for you, verify it for SU(2). 
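Taking up the invitation, here is a brute-force check for SU(2), where the structure constants are the Levi-Civita symbol. The sketch tests the Jacobi identity in the form $f_{abe}f_{cde} + f_{ace}f_{dbe} + f_{ade}f_{bce} = 0$ for every choice of external labels; the function names `eps` and `jacobi` are hypothetical, and the indices run over 0, 1, 2 rather than 1, 2, 3.

```python
def eps(a, b, c):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) // 2

def jacobi(a, b, c, d):
    """f_abe f_cde + f_ace f_dbe + f_ade f_bce, summed over e, with f = eps."""
    return sum(eps(a, b, e) * eps(c, d, e)
               + eps(a, c, e) * eps(d, b, e)
               + eps(a, d, e) * eps(b, c, e)
               for e in range(3))

violations = [(a, b, c, d)
              for a in range(3) for b in range(3)
              for c in range(3) for d in range(3)
              if jacobi(a, b, c, d) != 0]
print("number of index choices violating the identity:", len(violations))
```

Swapping the order of the last two indices in just one of the three terms flips its sign, and the sum no longer vanishes for all labels, so the relative signs carry real information.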

The recursion program produces Einstein gravity and Yang-Mills theory as the unique 
low energy theory for massless spin 2 and spin 1 particles, respectively, with sufficiently 
good large-z behavior for the recursion relations to be valid. Of course, we also know that 
in the Lagrangian formalism, the powerful constraints of local coordinate invariance and 
local gauge invariance fix the actions for Einstein gravity and Yang-Mills completely. 


Appendix 1 


Here, as promised, we use the recursion approach to prove the result conjectured in the preceding chapter, that 
for $n$-gluon scattering, the maximal helicity-violating amplitude is given by 

$$A(1^+, 2^+, \cdots, j^-, \cdots, n^-) = \frac{\langle jn \rangle^4}{\langle 12 \rangle \langle 23 \rangle \langle 34 \rangle \cdots \langle (n-1)n \rangle \langle n1 \rangle} \qquad (22)$$

(Using the cyclicity of the amplitude, we have with no loss of generality let gluon $n$ carry negative helicity.) 

We take $r = n$ and $s = 1$, and deform $\tilde{\lambda}_n \to \tilde{\lambda}_n + z\tilde{\lambda}_1$ and $\lambda_1 \to \lambda_1 - z\lambda_n$ (leaving $\lambda_n$ and $\tilde{\lambda}_1$ unchanged), in 
other words, $p_n \to p_n + zq$ and $p_1 \to p_1 - zq$ with $q = \lambda_n \tilde{\lambda}_1$. Write (3) as (see fig. N.3.3) 

$$A(1^+, 2^+, \cdots, j^-, \cdots, n^-) = A_3(\hat{1}^+, 2^+, \hat{K}^-)\,A_{n-1}(-\hat{K}^+, 3^+, \cdots, j^-, \cdots, \hat{n}^-)\,/\,P_L(0)^2 \qquad (23)$$


Here we define K(z) = P L (z) to simplify writing. We use a hat to indicate that the corresponding momentum 
has been complexified. Thus 1, h, K remind us that p\(z), p n (z), and K (z) = — (pi(z) + p 2 ) (evaluated atz = z L ) 
are the three complex momenta in the problem. 

In the spirit of recursion, we are supposing that At, and A n _\ are given by (22) (and the corresponding 
expression with all helicities flipped and angled brackets replaced by square brackets). Note that (22) does not 
refer to any possible relation between the untwiddled A. and twiddled A spinors and thus makes sense for both 
complex and real momenta. Notice that the sum over partition L and over h in (3) collapses to one term in (23). 
We have used the result A(+ + •••+) = 0 and A {—f •••+) = 0, and the second half of (17) to eliminate a 
diagram similar to that in fig. N.3.3 but with particles (n — 1) and n participating in the cubic vertex instead of 1 
and 2. 
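Recursing and using $P_L(0)^2 = 2p_1 \cdot p_2 = 2\langle 12\rangle[12]$, we obtain [see (19)]

$$A(1^+, 2^+, \cdots, j^-, \cdots, n^-) = \frac{[\hat 1 2]^3}{[2, \hat K][\hat K, \hat 1]}\; \frac{\langle j\hat n\rangle^4}{\langle \hat K 3\rangle\langle 34\rangle \cdots \langle (n-1)\hat n\rangle\langle \hat n\hat K\rangle}\; \frac{1}{\langle 12\rangle[12]} \tag{24}$$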

As before, we suppress overall constants. 
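
As a numerical aside, the shift used in this appendix can be checked on random momenta. The sketch below is only illustrative: it assumes the metric signature (+, −, −, −), real null momenta with $\tilde\lambda = \lambda^*$, and the normalization $p_{\alpha\dot\alpha} = \lambda_\alpha\tilde\lambda_{\dot\alpha}$, in which $\langle 12\rangle[12]$ comes out equal to $2p_1 \cdot p_2$. Other conventions differ by signs and factors of 2, which is harmless here since overall constants are suppressed anyway. All function names are hypothetical.

```python
import cmath
import random

def spinors(p):
    """(lam, lam_tilde) for a real null momentum p = (E, px, py, pz) with E + pz > 0."""
    E, px, py, pz = p
    root = cmath.sqrt(E + pz)
    lam = (root, (px + 1j * py) / root)
    lam_tilde = (lam[0].conjugate(), lam[1].conjugate())  # real momentum: tilde = conjugate
    return lam, lam_tilde

def bispinor(lam, lam_tilde):
    """2x2 matrix p_{alpha alphadot} = lam_alpha * lam_tilde_alphadot."""
    return [[lam[i] * lam_tilde[j] for j in range(2)] for i in range(2)]

def det(m):
    """Determinant of a 2x2 matrix; for a momentum bispinor this is p^2."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def bracket(a, b):
    """Antisymmetric spinor product, used for both <ab> and [ab]."""
    return a[0] * b[1] - a[1] * b[0]

def dot(p, k):
    """Minkowski product with signature (+, -, -, -)."""
    return p[0] * k[0] - p[1] * k[1] - p[2] * k[2] - p[3] * k[3]

def random_null(rng):
    """Random real null momentum with positive energy."""
    px, py, pz = (rng.uniform(-1.0, 1.0) for _ in range(3))
    return ((px * px + py * py + pz * pz) ** 0.5, px, py, pz)

rng = random.Random(0)
p1, p2, pn = (random_null(rng) for _ in range(3))
(l1, t1), (l2, t2), (ln, tn) = (spinors(p) for p in (p1, p2, pn))

# The shift: lam_1 -> lam_1 - z lam_n and lam_tilde_n -> lam_tilde_n + z lam_tilde_1.
z = 0.7 - 1.3j
l1_hat = tuple(a - z * b for a, b in zip(l1, ln))
tn_hat = tuple(a + z * b for a, b in zip(tn, t1))

P1, Pn = bispinor(l1, t1), bispinor(ln, tn)
P1_hat, Pn_hat = bispinor(l1_hat, t1), bispinor(ln, tn_hat)

# Shifted momenta stay null, and their sum is unchanged.
on_shell_error = max(abs(det(P1_hat)), abs(det(Pn_hat)))
conservation_error = max(
    abs(P1_hat[i][j] + Pn_hat[i][j] - P1[i][j] - Pn[i][j])
    for i in range(2) for j in range(2)
)
# In this normalization <12>[12] = 2 p1.p2 for real momenta.
mandelstam_error = abs(bracket(l1, l2) * bracket(t1, t2) - 2 * dot(p1, p2))

print(on_shell_error, conservation_error, mandelstam_error)
```

All three printed numbers should be at the level of rounding error. The on-shell property is automatic because each shifted momentum is still an outer product of two spinors.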

The trick consists of taking various hats off or leaving them on. Since $\tilde\lambda_1$ is unchanged, we can remove the
hat on $\hat 1$ when it appears in a square bracket. Similarly, since $\lambda_n$ is unchanged, we can remove the hat on $\hat n$ when
it appears in an angled bracket. On the other hand, we should leave the hat on $\hat K$. Instead, we use momentum
conservation $-\lambda_{\hat K}\tilde\lambda_{\hat K} = \lambda_{\hat 1}\tilde\lambda_1 + \lambda_2\tilde\lambda_2$ so that $\langle \hat K, 3\rangle[\hat K, 1] = -\langle 3, \hat K\rangle[\hat K, 1] = \langle 32\rangle[21]$ since [11] = 0. (You might
note that this is the same sort of manipulation used to derive the first identity we needed to massage $A_4$ into
shape in the preceding chapter.) Similarly, $\langle \hat n\hat K\rangle[2, \hat K] = -\langle n\hat K\rangle[\hat K, 2] = \langle n1\rangle[12]$.

Doing all this to (24) we obtain 


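$$A(1^+, 2^+, \cdots, j^-, \cdots, n^-) = \frac{[12]^3\langle jn\rangle^4}{\langle 12\rangle[12]\,\langle n1\rangle[12]\,\langle 32\rangle[21]\,\langle 34\rangle \cdots \langle (n-1)n\rangle} = \frac{\langle jn\rangle^4}{\langle 12\rangle\langle 23\rangle\langle 34\rangle \cdots \langle (n-1)n\rangle\langle n1\rangle}, \tag{25}$$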


precisely the conjectured result. 

Note how much more powerful the recursion approach is compared to the explicit spinor helicity calculation 
we did to obtain A (1, 2, 3, 4) in the preceding chapter, which in turn is so much more powerful than the traditional 
Feynman diagram calculation. Thus theoretical physics marches on. 

You might be puzzled that the quartic vertex (figure IV.5.1c) of Yang-Mills theory is not needed in the recursion 
program. Does this nonparticipation in the program mean that we can multiply the quartic term in the Lagrangian 
by an arbitrary coefficient (including 0) ? The resolution of this apparent paradox can be traced to the fact that 
the (perturbative) physical states of the gluon are built into the recursion relations. The quartic term is needed 
to guarantee gauge invariance and hence the two helicity states of the gluon. 


Appendix 2 


By showing you the mess in figure N.2.2 I have already plenty impressed upon you that the traditional Feynman 
diagram approach is almost hopeless when it comes to gluons. The situation with gravity is far worse. Consider 
the 3-graviton vertex. Conceptually it is easy to understand: we write $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$ and expand the
Einstein-Hilbert action (VIII.1.1) to $O(h^3)$. There it is, with indices suppressed, the cubic term $h\partial h\partial h$ in (VIII.1.5). Of
course, this actually represents many terms with the eight indices contracted every which way, but which you
can readily work out. Next, pick the harmonic gauge for example, and derive the Feynman rule for the 3-graviton
vertex $G_{\mu\alpha,\nu\beta,\sigma\gamma}(p_1, p_2, p_3)$, namely the analog of the 3-gluon vertex in (C.18). Each of the three gravitons, say
the one carrying momentum $p_1$, can be created by any one of the three h’s in $h\partial h\partial h$, and thus many terms are
generated simply by permuting. The two derivatives give two powers of momentum. Thus, a typical term has 
the form p v p 2l ,V a vVc,y 

Keep working! In all, $G_{\mu\alpha,\nu\beta,\sigma\gamma}(p_1, p_2, p_3)$ contains about 100 terms. Now imagine calculating the one-loop
contribution to graviton-graviton scattering. You get the point.

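By now, you fully appreciate that the traditional Feynman approach carries an enormous amount of unnecessary
off-shell information. Already, if we put $p_1$, $p_2$, and $p_3$ on shell and contract $G_{\mu\alpha,\nu\beta,\sigma\gamma}(p_1, p_2, p_3)$ with the
polarization vectors $\epsilon_1^{\mu\alpha}$, $\epsilon_2^{\nu\beta}$, and $\epsilon_3^{\sigma\gamma}$, the 3-graviton vertex simplifies enormously to

$$G(p_1, p_2, p_3) = \epsilon_1^{\mu\alpha}\epsilon_2^{\nu\beta}\epsilon_3^{\sigma\gamma}\,(p_{1\sigma}\eta_{\mu\nu} + \text{cyclic})(p_{1\gamma}\eta_{\alpha\beta} + \text{cyclic}) \tag{26}$$

Quite naturally, we can write the polarization vector for a spin 2 massless particle in terms of the polarization
vector for a spin 1 massless particle: $\epsilon^{\mu\alpha}(p) = \epsilon^\mu(p)\epsilon^\alpha(p)$. This form satisfies all that is required of a polarization
vector for spin 2: $\epsilon^{\mu\alpha}(p)p_\mu = 0$, $\epsilon^{\mu\alpha}(p) = \epsilon^{\alpha\mu}(p)$, and $\eta_{\mu\alpha}\epsilon^{\mu\alpha}(p) = 0$. Thus, indeed, the 3-graviton vertex
$G(p_1, p_2, p_3) = [\epsilon_1^\mu\epsilon_2^\nu\epsilon_3^\sigma(p_{1\sigma}\eta_{\mu\nu} + \text{cyclic})]^2$ is the square of the 3-gluon vertex (N.2.23), in confirmation of (20),
which of course is just the same statement couched in another notation.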
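
The three properties required of the spin 2 polarization can be checked in a few lines for an explicit choice of momentum. This is a minimal sketch under stated assumptions: metric signature (+, −, −, −), a massless particle moving along the z axis, and a circular polarization vector (0, 1, i, 0) whose normalization is picked purely for convenience. The helper names are hypothetical.

```python
# Metric signature (+, -, -, -) is an assumption of this sketch.
ETA = (1.0, -1.0, -1.0, -1.0)

def spin2_polarization(eps):
    """Build eps^{mu alpha} = eps^mu eps^alpha from a spin 1 polarization vector."""
    return [[eps[m] * eps[a] for a in range(4)] for m in range(4)]

energy = 3.0
p = (energy, 0.0, 0.0, energy)     # massless momentum along z
eps_plus = (0.0, 1.0, 1j, 0.0)     # circular polarization, unnormalized by choice
eps2 = spin2_polarization(eps_plus)

# Transversality: eps^{mu alpha} p_mu = 0 for every alpha.
transverse = [sum(ETA[m] * p[m] * eps2[m][a] for m in range(4)) for a in range(4)]

# Symmetry: eps^{mu alpha} = eps^{alpha mu}.
symmetric = all(eps2[m][a] == eps2[a][m] for m in range(4) for a in range(4))

# Tracelessness: eta_{mu alpha} eps^{mu alpha} = eps . eps = 0.
trace = sum(ETA[m] * eps2[m][m] for m in range(4))

print(transverse, symmetric, trace)
```

Tracelessness is just the statement that the circular polarization vector is null, $\epsilon \cdot \epsilon = 0$, which is why the product form works for helicity ±2 but would fail for a linear polarization.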



Exercises 


N.3.1 Show that the structure of Lie algebra (21) emerges naturally. 

N.3.2 In appendix 1 we recursed by complexifying the momenta of two external lines with helicity + and —. 
In the derivation of the recursion relation (3) we could have picked any two external lines to complexify. 
Determine the amplitude calculated directly in chapter N.2, namely $A(1^-, 2^-, 3^+, 4^+)$, by complexifying
lines 1 and 2. This is an example of the self-consistency argument sketched in the text. 

Particle physics experimentalists are fond of saying that yesterday’s spectacular discovery is today’s 
calibration and tomorrow’s annoying background. The canonical example is the Nobel-winning discovery 
of the CP-violating decay of the $K_L$ meson into two pions. In theoretical physics, yesterday’s discovery is
today’s homework exercise and tomorrow’s trivium. 

N.3.3 Using the explicit forms given for $A(1^-, 2^-, 3^+, 4^+)$ and $A(1^-, 2^+, 3^-, 4^+)$ in the preceding chapter,
check the estimated large-z behavior in (13-15).

N.3.4 Worry about the sloppy handling of factors of 2 in appendix 1. [Hint: The final result is correct because 
the polarization vectors in (5-6) are normalized to $|\epsilon|^2 = 2$ for convenience.]



N.4 Is Einstein Gravity Secretly the Square of Yang-Mills Theory?


Gravity and gauge theory 

Quantum gravity has baffled generations of theoretical physicists, as you have no doubt 
heard. One aspect of this puzzle is the relationship between gravity and gauge theory, 
which describes the other fundamental interactions. While gravity and gauge theory are 
both born of local invariance, the Einstein-Hilbert action $\int d^4x \sqrt{-g}R$ and the Yang-Mills
action $\int d^4x\, \mathrm{tr}(F_{\mu\nu}F^{\mu\nu})$ look completely different.

Perturbatively, gravity is afflicted with an infinite number of interaction terms, as was 
explained in chapter VIII.1, and hence gravity is not renormalizable, in stark contrast to
gauge theory. On the other hand, the two field theories enjoy many conceptual similarities 
between them. Yang-Mills theory is the unique low energy effective theory of a spin 1 
massless field, just as Einstein gravity is the unique low energy effective theory of a spin 2 
massless field. 

String theory unifies gravity and gauge theory. This remarkable fact alone points to a 
deep connection between gravity and gauge theory, even though within field theory the 
connection is totally obscure. One important clue is that the oscillator spectrum of the open 
string contains only the gauge field but not the graviton, which appears in the spectrum 
of the closed string. However, the closed string spectrum could be described as two copies 
of an open string spectrum, thus leading Kawai, Lewellen, and Tye to discover relations 
between graviton scattering and gauge boson scattering. 1 In the limit of the string energy 
scale going to infinity, we know that string theory reduces to field theory and thus a shadow 
of these KLT relations should survive in field theory. (As you might know, not all theorists 
are convinced that string theory corresponds to reality. If string theory eventually fails, 
its ultimate value might well turn out to be the light it sheds on the hidden structure of 
quantum field theory.) 

1 It is definitely beyond the scope of this book to explain these statements. See, for example, J. Polchinski, 
String Theory, p. 27. 
In any case, the bottom line is that string theory strongly hints that graviton amplitudes
can be expressed as products of Yang-Mills amplitudes, schematically
$\mathcal{M}_{\rm gravitons} \sim \mathcal{M}_{\rm gauge} \times \mathcal{M}_{\rm gauge}$. The first reaction of many theoretical physicists when first told this
is puzzled skepticism. How is this possible, they ask quite reasonably, since Yang-Mills
contains an internal symmetry group while gravity doesn’t?

Now that we have learned to strip color, a connection between amplitudes no longer
strikes us as so implausible, particularly if we stick to on-shell scattering amplitudes for
gluons in specified polarization states, namely amplitudes that experimentalists can
measure, rather than amplitudes carrying Lorentz indices that theorists using traditional
methods play with. As we saw in the preceding chapter, the color stripped
tree-level on-shell helicity amplitude for gauge boson scattering boils down to the $\langle\cdots\rangle$
and $[\cdots]$ products of two component spinors. We did not do the analogous calculation of
the tree-level on-shell helicity amplitude for graviton scattering, but we could anticipate
that the result would again be expressed in terms of the $\langle\cdots\rangle$ and $[\cdots]$ products. The
spinor helicity formalism is intrinsic to the Lorentz group SO(3, 1), not tied to a specific
theory. In particular, the interaction vertices in Einstein gravity are again given in terms
of scalar products of momenta and polarization vectors. Quite suggestively, the graviton
polarization vectors can be written, as mentioned in appendix 2 to the preceding chapter,
as $\epsilon^{\mu\nu} = \epsilon^\mu\epsilon^\nu$, a product of the gauge theory polarization vectors.

Indeed, I have already given part of the mystery away in the preceding chapter. We saw 
that the basic cubic interaction vertex of three gravitons (with complex momenta) is given 
by the square of the corresponding quantity for three gluons. 

In summary, thanks to our string theory friends, we now know that there exists a 
secret structural connection between gravity and gauge theory that is totally opaque at 
the Lagrangian level. 


Deformed graviton polarizations 

In this closing chapter, I give a brief introduction to the exciting quest for this secret 
connection. I will be content to look at one specific calculation. 

Go back to the BCFW recursion (chapter N.3). It would work for gravity if the complexified
scattering amplitude $\mathcal{M}(z)$ vanishes as $z \to \infty$. But naively, it would seem that
the situation for gravity is even worse than the situation for gauge theory, since the cubic
graviton vertex is quadratic in momentum and thus goes like $z^2$. (Recall the two powers
of derivative in the scalar curvature; see chapter VIII.1.) Repeat the calculation in the preceding
chapter for n-graviton on shell scattering. Go back to figure N.3.2 and interpret the
lines as gravitons. The $(n-2)$ cubic vertices give a factor of $z^{2(n-2)}$ for large z, easily overwhelming
the factor of $1/z^{n-3}$ from the $(n-3)$ propagators. This nasty behavior occurs
even before we include the polarization of the two hard gravitons.

The graviton carries helicity ±2 (appendix 2 of chapter VIII.1) and hence a polarization
“vector” $\epsilon^{\mu\nu}$, given by a symmetric and traceless tensor. We can naturally construct
$\epsilon^{\mu\nu} = \epsilon^\mu\epsilon^\nu$ out of the polarization vectors for a massless spin 1 particle (as already explained in
the preceding chapter). Thus, after deformation,

$$\epsilon_r^{++\,\mu\nu}(z) = \epsilon_r^{+\mu}\epsilon_r^{+\nu} = (q^* + zp_s)^\mu(q^* + zp_s)^\nu, \qquad \epsilon_r^{--\,\mu\nu}(z) = \epsilon_r^{-\mu}\epsilon_r^{-\nu} = q^\mu q^\nu \tag{1}$$

and

$$\epsilon_s^{++\,\mu\nu}(z) = \epsilon_s^{+\mu}\epsilon_s^{+\nu} = q^\mu q^\nu, \qquad \epsilon_s^{--\,\mu\nu}(z) = \epsilon_s^{-\mu}\epsilon_s^{-\nu} = (q^* - zp_r)^\mu(q^* - zp_r)^\nu \tag{2}$$


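Note that the $\epsilon^{\mu\nu}(z)$’s are in fact traceless and could go as either $z^0$ or $z^2$ for large z.
Putting it together, we obtain the naive estimate

$$\mathcal{M}^{--,++}_{\rm naive}(z) \to \frac{z^{2(n-2)}}{z^{n-3}} = z^{n-1}, \qquad \mathcal{M}^{--,--\ \text{or}\ ++,++}_{\rm naive}(z) \to z^{n+1}, \qquad \mathcal{M}^{++,--}_{\rm naive}(z) \to z^{n+3} \tag{3}$$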


The escalating behavior as n increases is the hallmark of a nonrenormalizable theory, as 
explained in chapter III.2. 
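
The bookkeeping behind the naive estimate (3) is simple enough to tabulate. The following sketch only counts powers of z: two per cubic vertex, minus one per propagator, plus two for each deformed polarization that grows with z according to (1) and (2). The function names and the choice n = 6 are illustrative assumptions.

```python
def polarization_power(leg, helicity):
    """Large-z power of a deformed graviton polarization, read off from (1) and (2)."""
    growing = {("r", "++"), ("s", "--")}   # these go like z^2, the others like z^0
    return 2 if (leg, helicity) in growing else 0

def naive_power(n, helicity_r, helicity_s):
    """Naive large-z exponent of the n-graviton amplitude with hard legs r and s."""
    vertex_power = 2 * (n - 2)        # (n - 2) cubic vertices, each ~ z^2
    propagator_power = -(n - 3)       # (n - 3) propagators, each ~ 1/z
    return (vertex_power + propagator_power
            + polarization_power("r", helicity_r)
            + polarization_power("s", helicity_s))

n = 6
table = {(hr, hs): naive_power(n, hr, hs)
         for hr in ("--", "++") for hs in ("--", "++")}
for (hr, hs), power in sorted(table.items()):
    print(f"M^({hr},{hs}) ~ z^{power}")
```

For n = 6 the exponents are 5, 7, 7, and 9, that is n − 1, n + 1, n + 1, and n + 3, growing without bound as n increases.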


Hard graviton in a soft spacetime 

Once again, we hope that real life is cushier than naive expectation. By the same reasoning 
used for gauge theory, we study a hard graviton blasting through a gravitational field, that 
is, a background of soft gravitons. So write the metric of spacetime as $\mathcal{G}_{\mu\nu} = g_{\mu\nu} + h_{\mu\nu}$.
Plug this into the Einstein-Hilbert action (VIII.1.1) and extract the terms quadratic in h.
While the calculation is straightforward, it does involve some heavy lifting. To avoid the
labor, we note that using the harmonic gauge, we did this calculation in (VIII.1.10) but
only for the special case $g_{\mu\nu} = \eta_{\mu\nu}$ (in other words, we expanded around flat Minkowski
spacetime rather than a general curved spacetime). We had

$$\mathcal{L} = \frac{1}{64\pi G}\left(\partial_\mu h_{\lambda\sigma}\partial^\mu h^{\lambda\sigma} - \frac{1}{2}\partial_\mu h\,\partial^\mu h\right) \tag{4}$$

with the trace degree of freedom $h = \eta^{\mu\nu}h_{\mu\nu}$. Henceforth, we set $64\pi G = 1$.

Armed with symmetry considerations and our knowledge of gravity (chapter VIII.1), we
can almost immediately guess that when we go from a flat $\eta_{\mu\nu}$ to a curved $g_{\mu\nu}$ background,
this quadratic Lagrangian generalizes to

$$\mathcal{L} = \sqrt{-g}\left(g^{\mu\nu}g^{\lambda\rho}g^{\sigma\tau}D_\mu h_{\lambda\sigma}D_\nu h_{\rho\tau} - \frac{1}{2}g^{\mu\nu}D_\mu h\,D_\nu h - 2R^{\lambda\rho\sigma\tau}h_{\lambda\sigma}h_{\rho\tau}\right) \tag{5}$$

with h now defined as $h = g^{\mu\nu}h_{\mu\nu}$. Here D denotes the covariant derivative with respect
to the curved metric $g_{\mu\nu}$, introduced in chapter VIII.1 and $R^{\lambda\rho\sigma\tau}$ the Riemann curvature
tensor constructed out of $g_{\mu\nu}$. I trust you not to confuse this D associated with the
curved background with the covariant derivative in Yang-Mills theory used in the preceding
chapter and mentioned below in passing.

Let us go through the various features of (5). The $\sqrt{-g}$ goes with the spacetime volume
and is common to any Lagrangian in curved spacetime, as we learned way back in (I.11.2).
We also learned there to promote any Lagrangian from flat to curved spacetime by replacing
$\eta_{\mu\nu}$ with $g_{\mu\nu}$ and the ordinary derivative by a covariant derivative (see chapter IV.5). As you
can see, everything pretty much works out in parallel with how things work out for gauge
theory. The new feature is the term involving the Riemann curvature tensor $R^{\lambda\rho\sigma\tau}$, which
vanishes upon restriction to flat spacetime. But you are not surprised that such a term
could pop up, given the term $\mathrm{tr}\,F^{\mu\nu}[a_\mu, a_\nu]$ in (N.3.8). Indeed, the only thing we can’t determine
without doing the actual calculation is the numerical coefficient (−2) of this term. That
particular number will play no role in the following discussion.

Also, in the preceding chapter we dropped a term linear in $a_\mu$ because of the equation
of motion $D_\mu F^{\mu\nu} = 0$. Here, analogously, we dropped a term linear in $h_{\mu\nu}$ because of
Einstein’s equation of motion $R_{\mu\nu} = 0$. This also explains why terms involving the Ricci
tensor $R_{\mu\nu}$ and the scalar curvature R do not appear in (5).

We want to calculate the large-z behavior of various scattering amplitudes $\mathcal{M}^{--,++}$,
etc., and compare with the naive expectation (3). We hope that the same trick we used
for the gauge theory case would also work for gravity. Now the string theory hint, that
$\mathcal{M}_{\rm gravitons} \sim \mathcal{M}_{\rm gauge} \times \mathcal{M}_{\rm gauge}$, suggests a factorized structure in the graviton amplitude,
and so more or less naturally leads to the guess that the first index $\lambda$ and the second index
$\sigma$ of $h_{\lambda\sigma}$ are somehow associated respectively with the two copies of $\mathcal{M}_{\rm gauge}$.

The key to breaking the problem apart is the Bern transformation unlinking these two
indices. Examining (5), we see that the only term that links the first index with the second
index of $h_{\lambda\sigma}$ is the term $g^{\mu\nu}D_\mu h\,D_\nu h$, since $h = g^{\mu\nu}h_{\mu\nu}$ does precisely that. How
to get rid of this term? The trick, following Bern and Grant, is to introduce a scalar field $\phi$
and add the term $2g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$. We are allowed to do this, since $\phi$ does not appear in the tree
level graviton scattering amplitude we are studying. (Of course, the theory is changed from
pure gravity, and $\phi$ does circulate in loop diagrams for graviton scattering. Some readers
may also know that in string theory the graviton appears with a scalar $\phi$, the experimentally
unobserved dilaton.)

For pedagogical clarity in explaining what we are going to do next, it is best to retreat
to the case of the flat background. Focus on the parenthesis in (4), now modified to
$(\partial_\mu h_{\lambda\sigma}\partial^\mu h^{\lambda\sigma} - \frac{1}{2}\partial_\mu h\,\partial^\mu h + 2\partial_\mu\phi\,\partial^\mu\phi)$. (The normalization of $\phi$ is, in this context, just
chosen for convenience.) Since we can always make a field redefinition (see the appendix to
chapter VIII.3) without affecting on-shell scattering amplitudes, we let $h_{\lambda\sigma} \to h_{\lambda\sigma} + \eta_{\lambda\sigma}\phi$
(and hence $h \to h + 4\phi$) and $\phi \to \phi + \frac{1}{2}h$. You can verify that our parenthesis changes to
$(\partial_\mu h_{\lambda\sigma}\partial^\mu h^{\lambda\sigma} - 2\partial_\mu\phi\,\partial^\mu\phi)$. Since in this manipulation the role of $\eta_{\lambda\sigma}$ is merely to convert
$h_{\lambda\sigma}$ into h, the same transformation works when $\eta_{\lambda\sigma}$ is promoted to $g_{\lambda\sigma}$.
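
The verification invited above is pure algebra, and it can be spot-checked numerically. Because every term is bilinear in the derivatives of the fields, it suffices (as an assumption of this sketch) to replace each $\partial_\mu X$ by a plain number X, as one would for a single Fourier mode, and to test the resulting quadratic form on random values. The metric signature (+, −, −, −) and all names below are illustrative choices.

```python
import random

ETA = (1.0, -1.0, -1.0, -1.0)   # diagonal flat metric, signature assumed

def contract(h):
    """h_{lambda sigma} h^{lambda sigma} for a symmetric h with lower indices."""
    return sum(ETA[l] * ETA[s] * h[l][s] ** 2 for l in range(4) for s in range(4))

def trace(h):
    """h = eta^{lambda sigma} h_{lambda sigma}."""
    return sum(ETA[l] * h[l][l] for l in range(4))

def before(h, phi):
    """Stand-in for the parenthesis with the added scalar: hh - h^2/2 + 2 phi^2."""
    return contract(h) - 0.5 * trace(h) ** 2 + 2.0 * phi ** 2

def after(h, phi):
    """Stand-in for the claimed result: hh - 2 phi^2, with no trace term."""
    return contract(h) - 2.0 * phi ** 2

def redefine(h, phi):
    """h -> h + eta * phi and phi -> phi + h/2, both built from the old fields."""
    new_h = [[h[l][s] + (ETA[l] * phi if l == s else 0.0) for s in range(4)]
             for l in range(4)]
    return new_h, phi + 0.5 * trace(h)

rng = random.Random(1)
worst = 0.0
for _ in range(100):
    upper = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]
    h = [[upper[min(l, s)][max(l, s)] for s in range(4)] for l in range(4)]  # symmetrize
    phi = rng.uniform(-1.0, 1.0)
    worst = max(worst, abs(before(*redefine(h, phi)) - after(h, phi)))

print(worst)
```

The cross terms cancel as 2 − 4 + 2 = 0, the trace-squared terms as −1/2 + 1/2 = 0, and the $\phi^2$ terms add up to 4 − 8 + 2 = −2, which is the content of the claim.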

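The upshot is that we can effectively rewrite (5) as

$$\mathcal{L} = \sqrt{-g}\left(g^{\mu\nu}g^{\lambda\rho}g^{\sigma\tau}D_\mu h_{\lambda\sigma}D_\nu h_{\rho\tau} - 2R^{\lambda\rho\sigma\tau}h_{\lambda\sigma}h_{\rho\tau}\right) \tag{6}$$

Now that $\phi$ has done his job, we have unceremoniously thrown him out since he doesn’t
contribute to the on-shell tree amplitudes we are interested in. We have thus dropped the
term $\partial_\mu\phi\,\partial^\mu\phi$.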

There has been quite a bit of formal development and perhaps the reader has lost sight
of what we are trying to do. Recall that we want to study the amplitude of a hard graviton
blasting through spacetime. Although a multitude of indices have appeared, as is always
the case with gravity, you should recognize that this Lagrangian is conceptually simple: it
is quadratic in the quantum field h describing the hard graviton and contains some given
c-number tensors $g_{\lambda\rho}(x)$ and $R_{\lambda\rho\sigma\tau}(x)$ pertaining to the background.


Unlinked melody 

The important point is that the two indices carried by $h_{\lambda\sigma}$ are now unlinked from each
other in the first term in (6). In chapter VIII.1, we learned to trade a “world index” like
$\lambda$ for a locally flat Lorentz index a by using the vierbein $e^a_\lambda(x)$. Here we are invited to
introduce two sets of vierbein, e and $\tilde e$, with their associated connections $\omega$ and $\tilde\omega$, and
write $h_{\lambda\sigma} = e^a_\lambda\tilde e^{\bar a}_\sigma h_{a\bar a}$. In reality, of course $\tilde e = e$ and $\tilde\omega = \omega$, but this notation keeps track of
the fact that the two sets of indices carried by $h_{\lambda\sigma}$ are unlinked. Note that $h_{\lambda\sigma}$ is treated in
our quadratic Lagrangian as just some tensor field living in a curved spacetime specified
by $g_{\lambda\sigma} = e^a_\lambda\tilde e^{\bar a}_\sigma\eta_{a\bar a}$.

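Also in chapter VIII.1 we emphasized that the covariant derivative acting on vectors
carrying a world index and on vectors carrying a locally flat Lorentz index assumes different
forms, $D_\mu V_\nu = \partial_\mu V_\nu - \Gamma^\lambda_{\mu\nu}V_\lambda$ and $\nabla_\mu V_a = \partial_\mu V_a - \omega_{\mu a}{}^{b}V_b$, respectively. For pedagogical
clarity, I will use two different symbols D and $\nabla$ to denote what is conceptually the same
operation. For our problem we have $D_\lambda h_{\mu\nu} = e^a_\mu\tilde e^{\bar a}_\nu\nabla_\lambda h_{a\bar a}$, with
$\nabla_\lambda h_{a\bar a} = \partial_\lambda h_{a\bar a} - \omega_{\lambda a}{}^{b}h_{b\bar a} - \tilde\omega_{\lambda\bar a}{}^{\bar b}h_{a\bar b}$.

With this notation, the relevant Lagrangian becomes

$$\mathcal{L} = \sqrt{-g}\left(g^{\mu\nu}\eta^{ab}\tilde\eta^{\bar a\bar b}\nabla_\mu h_{a\bar a}\nabla_\nu h_{b\bar b} - 2R^{ab\bar a\bar b}h_{a\bar a}h_{b\bar b}\right) \tag{7}$$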

We are now ready to study the large-z behavior of the scattering amplitude of a hard graviton
carrying momentum $zq + \cdots$ blasting through a curved background spacetime $g_{\mu\nu}$.

The analysis proceeds much as in the Yang-Mills case discussed in the preceding chapter.
Focus on the first term:
$\sqrt{-g}\,g^{\mu\nu}\eta^{ab}\tilde\eta^{\bar a\bar b}(\partial_\mu h_{a\bar a} - \omega_{\mu a}{}^{c}h_{c\bar a} - \tilde\omega_{\mu\bar a}{}^{\bar c}h_{a\bar c})(\partial_\nu h_{b\bar b} - \omega_{\nu b}{}^{d}h_{d\bar b} - \tilde\omega_{\nu\bar b}{}^{\bar d}h_{b\bar d})$.
The leading $O(z^2)$ behavior comes from the piece containing two derivatives in
the first term, namely $\mathcal{L}_{\rm lead} = \sqrt{-g}\,g^{\mu\nu}\eta^{ab}\tilde\eta^{\bar a\bar b}\partial_\mu h_{a\bar a}\partial_\nu h_{b\bar b}$, and thus contributes to the
amplitude a term proportional to $\eta^{ab}\tilde\eta^{\bar a\bar b}$. In the Yang-Mills case, the Lagrangian contains a
hidden “enhanced Lorentz” symmetry. Here the situation is even better: we have not one,
but two hidden “enhanced Lorentz” symmetries. The term $\mathcal{L}_{\rm lead}$ is evidently left invariant
by two separate SO(3, 1) Lorentz transformations, one operating on the a, b indices, the
other on the $\bar a$, $\bar b$ indices.

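The subleading O(z) behavior comes from the pieces in the first term containing one
derivative and one factor of either $\omega$ or $\tilde\omega$, for example
$\sqrt{-g}\,g^{\mu\nu}\eta^{ab}\tilde\eta^{\bar a\bar b}\,\partial_\mu h_{a\bar a}(\omega_{\nu b}{}^{d}h_{d\bar b} + \tilde\omega_{\nu\bar b}{}^{\bar d}h_{b\bar d})$.
In this way we find that $\mathcal{M}^{ab,\bar a\bar b} \to cz^2\eta^{ab}\tilde\eta^{\bar a\bar b} + z(\eta^{ab}\tilde A^{\bar a\bar b} + A^{ab}\tilde\eta^{\bar a\bar b}) + \cdots$, with
$A^{ab}$ and $\tilde A^{\bar a\bar b}$ two matrices antisymmetric in their indices. To see this, consider for example
the piece involving $\omega$ (after some relabeling of indices): $\sqrt{-g}\,g^{\mu\nu}\eta^{ac}\tilde\eta^{\bar a\bar b}(\partial_\mu h_{a\bar a})\,\omega_{\nu c}{}^{b}h_{b\bar b}$. This
gives rise to the term $A^{ab}\tilde\eta^{\bar a\bar b}$ in $\mathcal{M}^{ab,\bar a\bar b}$. Note that since the matrix $A^{ab}$ depends on the
spin connection $\omega^{ab}$ of the background, all we can say is that it is antisymmetric in its two
indices a b. Recall that this is quite analogous to what we did in the Yang-Mills case.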
Before reading further, you could now flex your mental muscle and push ahead to
obtain the sub-subleading $O(z^0)$ behavior. This comes from the pieces in the first term
containing two factors of $\omega$ and $\tilde\omega$, for example (again, after some relabeling of indices)
$\sqrt{-g}\,g^{\mu\nu}\eta^{cd}\tilde\eta^{\bar a\bar b}\,\omega_{\mu c}{}^{a}h_{a\bar a}\,\omega_{\nu d}{}^{b}h_{b\bar b}$. All we can now conclude is that this contributes to the scattering
amplitude a term of the form $B^{ab}\tilde\eta^{\bar a\bar b}$, with $B^{ab}$ an arbitrary matrix. To this order,
the second term in (7) also contributes, breaking the “enhanced Lorentz” symmetries completely.
Nevertheless, we can still exploit the known symmetry properties of the Riemann
curvature tensor under interchange of its indices to say something about its contribution
to the scattering amplitude. Putting it all together we conclude that

$$\mathcal{M}^{ab,\bar a\bar b} \to cz^2\eta^{ab}\tilde\eta^{\bar a\bar b} + z\left(\eta^{ab}\tilde A^{\bar a\bar b} + A^{ab}\tilde\eta^{\bar a\bar b}\right) + A^{ab\bar a\bar b} + \left(\eta^{ab}\tilde B^{\bar a\bar b} + B^{ab}\tilde\eta^{\bar a\bar b}\right) + O\!\left(\frac{1}{z}\right) \tag{8}$$

Compare this with (N.3.12), which states that the amplitude for the scattering of a hard
gluon off a background of soft gluons goes like $\mathcal{M}^{ab} = (cz + \cdots)\eta^{ab} + A^{ab} + \frac{1}{z}B^{ab} + \cdots$.
Amazingly, you can see that the large-z behavior for the scattering of a hard graviton off a
background of soft gravitons can be obtained by “squaring” the large-z behavior for the
scattering of a hard gluon off a background of soft gluons! In other words,
$\mathcal{M}^{ab,\bar a\bar b} \sim \mathcal{M}^{ab}\tilde{\mathcal{M}}^{\bar a\bar b}$, as far as the large-z behavior is concerned.
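
To see the “squaring” at the level of bookkeeping, one can multiply two copies of the gauge theory expansion term by term and collect powers of z. The sketch below treats each expansion as a Laurent series whose coefficients are mere labels; it assumes the gauge theory amplitude has the form $cz\,\eta + A + B/z$ and makes no attempt to track index structure or numerical constants. The dictionary representation and the names are hypothetical.

```python
from collections import defaultdict

def multiply(left, right):
    """Multiply two Laurent series given as {power of z: coefficient label}."""
    product = defaultdict(list)
    for power_l, term_l in left.items():
        for power_r, term_r in right.items():
            product[power_l + power_r].append(f"({term_l})({term_r})")
    return dict(product)

# Large-z expansion of the hard-gluon amplitude, one copy per set of indices.
gauge = {1: "c*eta", 0: "A", -1: "B"}
gauge_tilde = {1: "c*eta~", 0: "A~", -1: "B~"}

gravity = multiply(gauge, gauge_tilde)
for power in sorted(gravity, reverse=True):
    print(f"z^{power}: " + " + ".join(gravity[power]))
```

The orders $z^2$, $z^1$, and $z^0$ reproduce the pattern of terms in the graviton expansion: one $\eta\tilde\eta$ term, the two cross terms with an antisymmetric matrix, and at order $z^0$ a term with four free indices together with the two terms involving B. The overall constants (here $c^2$ rather than c) are simply relabeled.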

Just as in the Yang-Mills case, by exploiting gauge identities like $p_{ra}(z)\mathcal{M}^{a\bar a,b\bar b}\epsilon_{sb\bar b}(z) = 0$,
we can determine the large-z behavior of various helicity amplitudes. For your convenience
I remind you that (from the preceding chapter) $p_r(z) = p_r + zq$ and $p_s(z) = p_s - zq$.
Thus the gauge identity just displayed says that
$q_a\mathcal{M}^{a\bar a,b\bar b}\epsilon_{sb\bar b}(z) = -(1/z)\,p_{ra}\mathcal{M}^{a\bar a,b\bar b}\epsilon_{sb\bar b}(z)$. Recalling [see (1)] that $\epsilon_r^{--\,\mu\nu}(z) = q^\mu q^\nu$ we see that in calculating
the amplitude $\mathcal{M}^{--,h}(z)$ we can effectively replace $\epsilon_r^{--\,a\bar a}(z)$ by $(1/z^2)\,p_r^a p_r^{\bar a}$. Thus
we can immediately conclude, since $\epsilon_s^{++\,\mu\nu}(z) \sim z^0$ for large z [see (2)], that for example

$$\mathcal{M}^{--,++}(z) \to \frac{1}{z^2} \tag{9}$$

which is far better than the naive expectation $\mathcal{M}^{--,++}_{\rm naive}(z) \to z^{n-1}$. Indeed, the horrible
ever-escalating behavior with increasing n has disappeared. Even more remarkably, the
large-z behavior of graviton scattering amplitudes is consistent with the string-inspired
notion that gravity is “the square of Yang-Mills.” Recall that in gauge theory $\mathcal{M}^{-+}(z) \to 1/z$.
Thus, for large z, indeed $\mathcal{M}^{--,++}(z) \sim (\mathcal{M}^{-+}(z))^2$.

The bottom line here is that the large-z behavior of gravity is surprisingly benign and 
vanishes fast enough for the recursion program to work. 



Gravity is a square? 

So, is Einstein gravity secretly the square of Yang-Mills theory? 

Already we have seen in the preceding chapter, anticipating that the recursion program
works, that the primitive 3-point amplitude for gravity (for one helicity configuration)
$(\langle 12\rangle^3/\langle 23\rangle\langle 31\rangle)^2$ is the square of the primitive 3-point amplitude for gauge theory
$(\langle 12\rangle^3/\langle 23\rangle\langle 31\rangle)$, something that you could have never suspected by staring at $\sqrt{-g}R$ and
tr $F_{\mu\nu}F^{\mu\nu}$ until you are blue in the face.

The calculations in the previous section show that the large-z behavior for the scattering 
of a hard graviton off a background of soft gravitons could be obtained by “squaring” the 
large z behavior for the scattering of a hard gluon off a background of soft gluons, certainly 
something that nobody could have anticipated by looking at Lagrangians. 

Further evidence that the answer to the title of this chapter is “yes” comes from a recent
calculation by Bern, Carrasco, and Johansson. Interestingly, they do not strip the color
from a Yang-Mills theory, but instead show that they can write the “color-dressed” tree
amplitudes in the form

$$\mathcal{A}^{\rm tree}(1, 2, \cdots, n) = \sum_a \frac{n_a c_a}{\left(\prod_j p_j^2\right)_a} \tag{10}$$


It is beyond the scope of this book to explain in detail how this expression is obtained. I
merely state that the index a labels an individual diagram. For each diagram, the amplitude
may be written as the product of a kinematic function $n_a$ of the momenta and a color factor
$c_a$, divided by the product of the squared momenta $p_j^2$ carried by the internal lines. (I do not explain
here how $n_a$ and $c_a$ are defined.) The tree amplitude is then given by a sum over all tree
diagrams.

Bern et al. then conjecture that the n-graviton scattering amplitude at the tree level is
given, amazingly, by

$$\mathcal{M}^{\rm tree}_{\rm gravity}(1, 2, \cdots, n) = \sum_a \frac{n_a n_a}{\left(\prod_j p_j^2\right)_a} \tag{11}$$


They have checked by explicit computation that their conjecture in fact holds up to n = 8.
Furthermore, they have also verified that the conjecture, suitably generalized, also holds 
for the various supercousins of Einstein gravity and Yang-Mills theory. 

Thus the evidence is extremely strong that, yes indeed, Einstein gravity is secretly the 
square of Yang-Mills theory, at least at the level of tree amplitudes. However, as of this 
writing (February 2009), there is no definitive understanding within field theory. The final 
word on the subject has yet to be said, and it is not even clear what the final path to the final 
word might be. I would be foolish indeed to discuss this further in a textbook when the 
entire subject is being rapidly developed. By the time this book is published, the conjecture 
that Einstein gravity is the square of Yang-Mills theory may well have been proved. If not, 
then nothing would please me more than if a reader of this textbook could go on and prove 
it, hopefully not just at the tree level, but to all orders. 


What is the simplest field theory? 

The uninitiated would likely answer $\varphi^4$ theory. Indeed, field theory texts almost all start with
some kind of scalar field theory. Even I am not able to do any better. But the sophisticated,
namely you, now that you have reached the end of this text, realize that the more symmetry
the theory has the better. To theoretical physicists, simplicity actually secretly means
symmetry. Incidentally, I have always hated scalar field theories, and have ventured to
say so publicly. It is hard to like the action $\mathcal{L} = \frac{1}{2}(\partial\varphi)^2 - \lambda\varphi^4$, so barren of color and flavor.
Some of the major problems facing particle physics, such as the hierarchy problem, may
eventually turn out to stem from our not having mastered scalar field theory.

Of course, scalar field theory is the simplest in the superficial sense that you need to 
know the least to approach it. As I said in chapter I.12, once one is familiar with scalar
field theory the rest consists of “merely” decorating the field with various indices describing 
spacetime or internal symmetries. But the symmetry and the resulting structure provide 
us with handles to grab on to. Both Yang-Mills theory and Einstein gravity have an internal 
logic sorely lacking in scalar field theory. As I mentioned in chapters VII.3 and VIII.4, 
the consensus view is that the first exactly soluble field theory would almost certainly be 
$\mathcal{N} = 4$ supersymmetric Yang-Mills theory, the supercousin of pure Yang-Mills theory. The
remarkable recent developments described in the last three chapters have only reinforced 
this view. Almost beyond belief, even gravity may be simpler than we had long thought. 
For large complexified momentum, graviton scattering for some helicity arrangements 
actually behaves better than gluon scattering. The evidence is mounting that Einstein 
gravity may in some sense be the square of Yang-Mills theory. So now we are left with the 
amusing thought that the simplest field theory may well end up being gravity or $\mathcal{N} = 8$
supergravity with its maximal supersymmetry. (At this point, a friend of mine who works
with $\mathcal{N} = 8$ supergravity pipes up, “It sure doesn’t look simpler if you are the guy doing
the calculation!” It is clear from the simple dimensional argument of chapter III.2 that 
as one goes to higher order, the numerator of the Feynman integrand quickly becomes 
extremely involved.) 

Only time will tell who will win the simplest field theory contest, but we do have two 
convincing candidates. 



More Closing Words 


In the closing words to the first edition of this book, I wrote that Yang-Mills theory 
was almost begging for a better notation that would lay bare the deeper structure of the 
theory. Oy, the excess baggage we have to carry! Ten thousand terms instead of one. In 
some respects, the spinor helicity formalism and the recursion program explained in 
chapters N.2-N.4 provide a partial answer to that pious wish. 

Imagine some theorist idly wondering, after 1865, if there were a better notation to 
describe the six fields $E_x$, $E_y$, $E_z$, $B_x$, $B_y$, $B_z$ for which Maxwell had written 20 equations
(since he did not use vector notation). We can even fantasize that by fooling around with
numerology (“Look, 4 · 3/2 = 6!”), this “crackpot” came up with an antisymmetric 4 by 4
matrix he called F. Shoehorning Maxwell’s equations in vacuum (some of them stating 
that the time variation of E and B is related to the space variation of E and B) into this 
strange notation, this guy could even stumble on a secret connection between space and 
time. 

The spinor helicity formalism and the recursion program, though elegant, are still 
rooted in the perturbative expansion of the 1940s. Can they be pushed into the nonperturbative
regime? There have been attempts in that direction.

In all previous revolutions in physics, a formerly cherished concept has to be jettisoned. 
If we are poised before another conceptual shift, something else might have to go. Lorentz 
invariance, perhaps? More likely, we may have to abandon strict locality. Again, in closing 
words I mumble something (from steepest descent to integral to what?) about modifying 
the form of the path integral. The recursion program and the resuscitated S-matrix approach
might be a step in this direction, formulating field theory while avoiding mention
of a local Lagrangian. But we need analyticity, and of course analyticity follows from locality
and causality, as far as we understand. We know also that even local field theory could
spawn non-local constructs, most notoriously the horizon of a black hole. But there the
dynamics bends the causal structure of spacetime out of whack. The lack of strict locality
is not built into the laws of physics.

Of course, we also know how to imbue physics with non-locality right from the start. 
We have Wilson’s lattice formulation of gauge theory, and more recently, Wen’s intriguing 
lattice formulation of gravity. 

When I showed the last three chapters to our friend S E, she mused, after some reflection, 
“Now I see what theorists could always do when in doubt: enhance the symmetry and make 
it local, complexify and bow to Cauchy, and take a square root when possible!” 

I nodded, “These are the three ways of the warrior theorist: I call them the Einstein 
way, the Heisenberg way, and the Dirac way. They were wildly successful in the past, and 
perhaps they will work in the future as well.” 

With this edition of my textbook, I can no doubt count on a new group of readers to 
come up with fresh insights into field theory. As these new chapters suggest, there may 
still be plenty of secret structures to uncover. And thus field theory marches on. 

Finally, I reveal the origin of the quote at the start of the preface to the second edition. As a 
kid, Feynman came across a calculus book 1 that proclaimed “What one fool can do, another 
can.” He was thus inspired to master calculus. Now that you have mastered quantum field 
theory, you can switch from the “understand” in the preface to the “do” in these closing 
words. 


1 Silvanus P. Thompson (1851-1916), Calculus Made Easy, 1910, updated by Martin Gardner, St. Martin’s Press, 
(1998). I am kind of trying to do for quantum field theory what Thompson did for calculus. 



Appendix A


Gaussian Integration and the Central Identity of Quantum Field Theory


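The basic Gaussian:

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}x^2} = \sqrt{2\pi} \tag{1}$$

The scaled Gaussian:

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2} = \left(\frac{2\pi}{a}\right)^{\frac{1}{2}} \tag{2}$$

Moments:

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2}\, x^{2n} = \left(\frac{2\pi}{a}\right)^{\frac{1}{2}} \frac{1}{a^n}\,(2n-1)(2n-3)\cdots 5\cdot 3\cdot 1, \qquad n \geq 1 \tag{3}$$

Gaussian with source:

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2 + Jx} = \left(\frac{2\pi}{a}\right)^{\frac{1}{2}} e^{J^2/2a} \tag{4}$$

$$\int_{-\infty}^{+\infty} dx\, e^{-\frac{1}{2}ax^2 + iJx} = \left(\frac{2\pi}{a}\right)^{\frac{1}{2}} e^{-J^2/2a} \tag{5}$$

$$\int_{-\infty}^{+\infty} dx\, e^{\frac{1}{2}iax^2 + iJx} = \left(\frac{2\pi i}{a}\right)^{\frac{1}{2}} e^{-iJ^2/2a} \tag{6}$$

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{-\frac{1}{2}x\cdot A\cdot x + J\cdot x} = \left(\frac{(2\pi)^N}{\det[A]}\right)^{\frac{1}{2}} e^{\frac{1}{2}J\cdot A^{-1}\cdot J} \tag{7}$$

$$\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\cdots\int_{-\infty}^{+\infty} dx_1 dx_2 \cdots dx_N\, e^{\frac{i}{2}x\cdot A\cdot x + iJ\cdot x} = \left(\frac{(2\pi i)^N}{\det[A]}\right)^{\frac{1}{2}} e^{-\frac{i}{2}J\cdot A^{-1}\cdot J} \tag{8}$$

In what follows, we omit an overall factor.

Central identity of quantum field theory:

$$\int D\varphi\, e^{-\frac{1}{2}\varphi\cdot K\cdot\varphi - V(\varphi) + J\cdot\varphi} = e^{-V(\delta/\delta J)}\, e^{\frac{1}{2}J\cdot K^{-1}\cdot J} \tag{9}$$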
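A trivial variation:

$$\int D\varphi\, e^{-\frac{1}{2}\varphi\cdot K\cdot\varphi + J\cdot\varphi} = e^{\frac{1}{2}J\cdot K^{-1}\cdot J} \tag{10}$$

Variations:

$$\int D\varphi\, e^{\frac{i}{2}\varphi\cdot K\cdot\varphi + iJ\cdot\varphi} = e^{-\frac{i}{2}J\cdot K^{-1}\cdot J} \tag{11}$$

$$\int D\varphi\, e^{i\int d^dx\,[\frac{1}{2}\varphi(x)K\varphi(x) + J(x)\varphi(x)]} = e^{i\int d^dx\,[-\frac{1}{2}J(x)K^{-1}J(x)]} \tag{12}$$

$$\int D\varphi\, e^{-\int d^dx\,[\frac{1}{2}\varphi(x)K\varphi(x) + J(x)\varphi(x)]} = e^{\int d^dx\,[\frac{1}{2}J(x)K^{-1}J(x)]} \tag{13}$$

(where $K$ or $K^{-1}$ or both may be nonlocal)

A specific example:

$$\int D\varphi\, e^{i\int d^dx\,[\frac{1}{2}\lambda\varphi^2 + \varphi M]} = e^{i\int d^dx\,[-(1/2\lambda)M^2]} \tag{14}$$

For $K$ hermitean with $\varphi$ complex:

$$\int D\varphi^\dagger D\varphi\, e^{-\varphi^\dagger\cdot K\cdot\varphi + J^\dagger\cdot\varphi + \varphi^\dagger\cdot J} = e^{J^\dagger\cdot K^{-1}\cdot J} \tag{15}$$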


As noted earlier, various numerical factors have been swept under the integration measure. In applying these 
formulas, be sure that these factors are not relevant for your purposes. 
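
The one-dimensional formulas and the $N = 2$ case of (7) are easy to spot-check numerically. The following is only a minimal sketch, using the Python standard library: the values of `a`, `J`, the matrix `A`, the cutoffs, and the helper name `trapezoid` are all arbitrary illustrative choices, not anything taken from the text. It checks (2), (3) with $n = 2$, (4), and (7) by brute-force quadrature.

```python
import math

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule with n equal panels on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

a, J = 1.7, 0.6            # illustrative values only (a > 0)
L, panels = 12.0, 4000     # hand-picked cutoff and resolution
pref = math.sqrt(2 * math.pi / a)

# (2): scaled Gaussian
lhs2 = trapezoid(lambda x: math.exp(-0.5 * a * x * x), -L, L, panels)
rhs2 = pref

# (3) with n = 2: fourth moment, (2n - 1)(2n - 3) = 3
lhs3 = trapezoid(lambda x: x**4 * math.exp(-0.5 * a * x * x), -L, L, panels)
rhs3 = pref * 3 / a**2

# (4): Gaussian with a real source J
lhs4 = trapezoid(lambda x: math.exp(-0.5 * a * x * x + J * x), -L, L, panels)
rhs4 = pref * math.exp(J * J / (2 * a))

# (7) for N = 2, with an assumed symmetric positive-definite matrix A
A = [[2.0, 0.5], [0.5, 1.0]]
Jv = [0.3, -0.2]
detA = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[A[1][1] / detA, -A[0][1] / detA],
        [-A[1][0] / detA, A[0][0] / detA]]

def integrand(x, y):
    quad = A[0][0] * x * x + 2 * A[0][1] * x * y + A[1][1] * y * y
    return math.exp(-0.5 * quad + Jv[0] * x + Jv[1] * y)

R, m = 10.0, 400
lhs7 = trapezoid(lambda x: trapezoid(lambda y: integrand(x, y), -R, R, m), -R, R, m)
JAJ = sum(Jv[i] * Ainv[i][j] * Jv[j] for i in range(2) for j in range(2))
rhs7 = math.sqrt((2 * math.pi) ** 2 / detA) * math.exp(0.5 * JAJ)

for label, lhs, rhs in [("(2)", lhs2, rhs2), ("(3)", lhs3, rhs3),
                        ("(4)", lhs4, rhs4), ("(7)", lhs7, rhs7)]:
    print(label, lhs, rhs)
```

The trapezoid rule is used here because it converges very quickly for rapidly decaying smooth integrands; any other quadrature would serve equally well.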



Appendix B 


A Brief Review of Group Theory 


I give here a brief review of the group theory I will need in the text. I assume that you have been exposed to some 
group theory, otherwise this instant review might not be intelligible. Most of the concepts are illustrated with 
examples, and it goes without saying that you should work out all the examples and verify the assertions made 
without proof. 


SO(N)


The special orthogonal group $SO(N)$ consists of all $N$ by $N$ real matrices $O$ that are orthogonal

$$O^T O = 1 \tag{1}$$

and have unit determinant

$$\det O = 1 \tag{2}$$

We denote the element in the $i$th row and $j$th column by $O^{ij}$. The group $SO(N)$ consists of rotations in $N$-dimensional
Euclidean space and its defining or fundamental representation is given by the $N$ component vector
$v = \{v^j, j = 1, \ldots, N\}$, which transforms under the action of the group element $O$ according to (as always, all
repeated indices are summed over)

$$v^i \to v'^i = O^{ij}v^j \tag{3}$$


We define tensors as objects that transform as if they are equal to the product of vectors. For example, the tensor
$T^{ijk}$ transforms according to

$$T^{ijk} \to T'^{ijk} = O^{il}O^{jm}O^{kn}T^{lmn} \tag{4}$$

as if it is equal to the product $v^i v^j v^k$. The emphasis is on the phrase “as if”: $T^{ijk}$ is not to be thought of as being
equal to $v^i v^j v^k$.

It is important to develop some “feel” or intuition for groups and their representations. Some people find
it helpful to picture a certain number of objects being acted upon by the group and transformed into linear
combinations of each other. Thus, picture $T^{ijk}$ as $N^3$ objects being scrambled together.

Tensors furnish representations of the group. In our particular example, each group element is represented
by an $N^3$ by $N^3$ matrix acting on the $N^3$ objects $T^{ijk}$. The number of objects in a tensor is called the dimension
of the representation.

It may well be that any given object in a representation does not transform, under all the elements of the group,
into a linear combination of all the other objects, but only into a subset of them. Let me illustrate with an example.
Consider $T^{ij} \to T'^{ij} = O^{il}O^{jm}T^{lm}$. Form the symmetric $S^{ij} = \frac{1}{2}(T^{ij} + T^{ji})$ and antisymmetric
$A^{ij} = \frac{1}{2}(T^{ij} - T^{ji})$ combinations. The symmetric combination $S^{ij}$ transforms into $O^{il}O^{jm}S^{lm}$, which is obviously symmetric.
Similarly, $A^{ij}$ transforms into $O^{il}O^{jm}A^{lm}$, which is obviously antisymmetric. In other words, the set of $N^2$
objects contained in $T^{ij}$ splits into two sets: $\frac{1}{2}N(N+1)$ objects contained in $S^{ij}$ and $\frac{1}{2}N(N-1)$ objects contained
in $A^{ij}$. The $S^{ij}$'s transform among themselves and the $A^{ij}$'s transform among themselves.

The representation furnished by $T^{ij}$ is said to be reducible: It breaks apart into two representations. Obviously,
representations that do not break apart are called irreducible.

We just exploited the obvious fact that the symmetry properties of a tensor under permutation of its indices are
not changed by the group transformation, namely that the indices on a tensor transform independently, as in (4).
The various possible symmetry properties may be classified with Young tableaux, which are useful in a general
treatment of group theory. Fortunately, in the field theory literature one rarely encounters a tensor with such
complex symmetry properties that one has to learn about Young tableaux.

Another way of saying this is that we can restrict our attention to tensors with definite symmetry properties
under permutation of their indices. In our specific example, we can always take $T^{ij}$ to be either symmetric or
antisymmetric under the exchange of $i$ and $j$.

We have yet to use the properties (1) and (2). Given a symmetric tensor $T^{ij}$ consider the combination
$T = \delta^{ij}T^{ij}$, known as the trace. Then $T \to \delta^{ij}T'^{ij} = \delta^{ij}O^{il}O^{jm}T^{lm} = (O^T)^{li}\delta^{ij}O^{jm}T^{lm} = \delta^{lm}T^{lm} = T$, where
we used (1). In other words, $T$ transforms into itself. We can subtract the trace from $T^{ij}$, forming the traceless
tensor $Q^{ij} = T^{ij} - (1/N)\delta^{ij}T$. The $\frac{1}{2}N(N+1) - 1$ objects contained in $Q^{ij}$ transform among themselves.

To summarize, given two vectors $v$ and $w$, we can form a tensor, and decompose the tensor into a symmetric
traceless combination, a trace, and an antisymmetric tensor. This process is written as

$$N \otimes N = [\tfrac{1}{2}N(N+1) - 1] \oplus 1 \oplus \tfrac{1}{2}N(N-1) \tag{5}$$

In particular, for $SO(3)$, $3 \otimes 3 = 5 \oplus 1 \oplus 3$, a relation you should be familiar with from courses on mechanics
and electromagnetism.
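
The decomposition (5) can be checked numerically for $SO(3)$. In the sketch below the tensor `T`, the rotation angles, and all function names are arbitrary choices made for illustration. It splits $T^{ij}$ into symmetric traceless, trace, and antisymmetric parts, and verifies that each piece of the rotated tensor equals the rotated piece, so the 5, the 1, and the 3 do not mix.

```python
import math

N = 3

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def rotation(alpha, beta):
    """A sample SO(3) element: rotate about axis 3, then about axis 1."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    Rz = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    Rx = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return matmul(Rx, Rz)

def transform(O, T):
    """T'^{ij} = O^{il} O^{jm} T^{lm}, as in the two-index version of (4)."""
    return [[sum(O[i][l] * O[j][m] * T[l][m]
                 for l in range(N) for m in range(N))
             for j in range(N)] for i in range(N)]

def decompose(T):
    """Return (symmetric traceless Q, trace, antisymmetric A)."""
    tr = sum(T[i][i] for i in range(N))
    Q = [[0.5 * (T[i][j] + T[j][i]) - (tr / N if i == j else 0.0)
          for j in range(N)] for i in range(N)]
    A = [[0.5 * (T[i][j] - T[j][i]) for j in range(N)] for i in range(N)]
    return Q, tr, A

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(N) for j in range(N))

T = [[1.0, 2.0, -0.5], [0.3, -1.2, 4.0], [2.2, 0.7, 0.9]]   # arbitrary tensor
O = rotation(0.7, -1.1)

Q, tr, A = decompose(T)
Qp, trp, Ap = decompose(transform(O, T))

# Number of independent objects in each piece: (5, 1, 3) for N = 3
dims = (N * (N + 1) // 2 - 1, 1, N * (N - 1) // 2)

print(max_diff(Qp, transform(O, Q)), abs(trp - tr), max_diff(Ap, transform(O, A)))
```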

There are two conventions for naming representations. We can simply give the dimension of the representation.
(This can occasionally be ambiguous: Two distinct representations may happen to have the same dimension.)
Alternatively, we can specify the symmetry properties of the tensor furnishing the representation. For instance,
the representation furnished by a totally antisymmetric tensor of $n$ indices is often denoted by $[n]$ and the
representation furnished by a totally symmetric traceless tensor of $n$ indices by $\{n\}$. Obviously, $[1] = \{1\}$. In this
notation, the decomposition in (5) can be written as $\{1\} \otimes \{1\} = \{2\} \oplus \{0\} \oplus [2]$. For the group $SO(3)$, with its
long standing in physics, the confusion over names is almost worse than in reading Russian novels: For instance,
$\{1\}$ is also known as $p$ and $\{2\}$ as $d$.

We have yet to use (2). Using the antisymmetric symbol $\varepsilon^{i_1 i_2 \cdots i_N}$, we write (2) as

$$\varepsilon^{i_1 i_2 \cdots i_N}\, O^{i_1 1} O^{i_2 2} \cdots O^{i_N N} = 1 \tag{6}$$

or equivalently

$$\varepsilon^{i_1 i_2 \cdots i_N}\, O^{i_1 j_1} O^{i_2 j_2} \cdots O^{i_N j_N} = \varepsilon^{j_1 j_2 \cdots j_N} \tag{7}$$

By multiplying (7) by $O^T$ repeatedly, we can obviously generate more identities. Instead of drowning in a sea of
indices, let me explain this point by specializing to say $N = 3$. Thus, multiplying (7) by $(O^T)^{j_3 k_3}$, summing over $j_3$,
and then renaming $k_3$ as $i_3$, we obtain

$$\varepsilon^{i_1 i_2 i_3}\, O^{i_1 j_1} O^{i_2 j_2} = \varepsilon^{j_1 j_2 j_3}\, (O^T)^{j_3 i_3}$$

Speaking loosely, we can think of moving some of the $O$'s on the left hand side of (7) to the right hand side,
where they become $O^T$'s.

Using these identities, you can easily show that $[n]$ is equivalent to $[N - n]$, that is, these two representations
transform in the same way. For example, as is well known, in $SO(3)$ the antisymmetric 2-index tensor is equivalent
to the vector. (The cross product of two vectors is a vector.)

Any orthogonal matrix can be written as $O = e^A$. The conditions (1) and (2) imply that $A$ is real and
antisymmetric, so that $A$ may be expressed as a linear combination of $N(N-1)/2$ antisymmetric matrices
denoted by $iJ^{ij}$: $O = e^{i\theta^{ij}J^{ij}}$ (with repeated indices summed over). We have defined $J^{ij}$ as imaginary and
antisymmetric and hence hermitean. Since the commutator $[J^{ij}, J^{kl}]$ is antihermitean, it can be written as a
linear combination of the $iJ$'s.

Ironically, some students are confused at this point because of their familiarity with $SO(3)$, which has special
properties that do not generalize to $SO(N)$.

In speaking about rotations in 3-dimensional space we can specify a rotation as either around say the third
axis, with the corresponding generator $J^3$, or as in the (1-2)-plane, with the corresponding generator $J^{12} = -J^{21}$.
In higher dimensions, for example 10-dimensional space, we can speak of a rotation in the (6-7)-plane, with the
corresponding generator $J^{67} = -J^{76}$, but it is nonsense to speak of a rotation around the fifth axis. Thus, to
generalize to higher dimensions we should write the standard commutation relation $[J^1, J^2] = iJ^3$ for $SO(3)$ as
$[J^{23}, J^{31}] = iJ^{12}$, which can be generalized immediately to

$$[J^{ij}, J^{kl}] = i(\delta^{ik}J^{jl} - \delta^{jk}J^{il} + \delta^{jl}J^{ik} - \delta^{il}J^{jk}) \tag{8}$$

The right hand side reflects the antisymmetric character of $J^{ij} = -J^{ji}$. A potential confusion some students
may have about the notation: $J^{ij}$ denotes a matrix generating rotation in the $(i$-$j)$-plane, a matrix with element
$(J^{ij})^{kl}$ in the $k$-th row and $l$-th column. The indices $i$, $j$, $k$, and $l$ all run from 1 to $N$, but in $(J^{ij})^{kl}$ the set $\{ij\}$
and the set $\{kl\}$ should be distinguished conceptually: The former labels the generator and the latter are matricial
indices when the generator is regarded as a matrix. As an exercise, write down $(J^{ij})^{kl}$ explicitly and obtain (8) by
direct computation.
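
For readers who want to check their answer to this exercise by machine, here is a small sketch. It assumes one particular choice of sign and normalization, $(J^{ij})^{kl} = -i(\delta^{ik}\delta^{jl} - \delta^{il}\delta^{jk})$; that choice, and the function names, are mine rather than the text's, though the choice does reproduce (8) with the signs as written. Indices run from 0 in the code.

```python
from itertools import product

def delta(a, b):
    return 1 if a == b else 0

def generator(i, j, N):
    """(J^{ij})^{kl} = -i (delta^{ik} delta^{jl} - delta^{il} delta^{jk})."""
    return [[-1j * (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k))
             for l in range(N)] for k in range(N)]

def commutator(A, B):
    N = len(A)
    AB = [[sum(A[r][s] * B[s][c] for s in range(N)) for c in range(N)]
          for r in range(N)]
    BA = [[sum(B[r][s] * A[s][c] for s in range(N)) for c in range(N)]
          for r in range(N)]
    return [[AB[r][c] - BA[r][c] for c in range(N)] for r in range(N)]

def worst_violation(N):
    """Largest entry of (left side minus right side) of (8), over all i, j, k, l."""
    worst = 0.0
    for i, j, k, l in product(range(N), repeat=4):
        lhs = commutator(generator(i, j, N), generator(k, l, N))
        Jjl, Jil = generator(j, l, N), generator(i, l, N)
        Jik, Jjk = generator(i, k, N), generator(j, k, N)
        for r, c in product(range(N), repeat=2):
            rhs = 1j * (delta(i, k) * Jjl[r][c] - delta(j, k) * Jil[r][c]
                        + delta(j, l) * Jik[r][c] - delta(i, l) * Jjk[r][c])
            worst = max(worst, abs(lhs[r][c] - rhs))
    return worst

for N in (3, 4, 5):
    print(N, worst_violation(N))
```

Each generator built this way is imaginary and antisymmetric, hence hermitean, as the text requires.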

In studying group theory, as I have already remarked, one source of confusion comes from the fact that
some of the smaller groups, which we tend to encounter first in our studies, have special properties that do not
generalize. The special property of $SO(3)$ we just noted is due to the fact that the antisymmetric symbol $\varepsilon^{ijk}$ carries
three indices and thus $J^{ij}$ may be written as $J^k = \frac{1}{2}\varepsilon^{ijk}J^{ij}$. For $SO(4)$ the antisymmetric symbol $\varepsilon^{ijkl}$ carries four
indices and we can form the combinations $\frac{1}{2}(J^{ij} \pm \frac{1}{2}\varepsilon^{ijkl}J^{kl})$. Define $J^1_\pm = \frac{1}{2}(J^{23} \pm J^{14})$, $J^2_\pm = \frac{1}{2}(J^{31} \pm J^{24})$, and
$J^3_\pm = \frac{1}{2}(J^{12} \pm J^{34})$. By explicit computation, show that $[J^i_+, J^j_+] = i\varepsilon^{ijk}J^k_+$, $[J^i_-, J^j_-] = i\varepsilon^{ijk}J^k_-$, and $[J^i_+, J^j_-] = 0$.
This proves the well-known theorem that $SO(4)$ is locally isomorphic to $SO(3) \otimes SO(3)$.
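
The "explicit computation" asked for here can also be done numerically. The sketch below again assumes the explicit form $(J^{ij})^{kl} = -i(\delta^{ik}\delta^{jl} - \delta^{il}\delta^{jk})$, which is my choice of convention and not stated in the text, and builds the six combinations $J^a_\pm$ for $SO(4)$. All helper names are hypothetical.

```python
from itertools import product

N = 4

def J(i, j):
    """(J^{ij})^{kl} with labels i, j, k, l running over 1..4, as in the text."""
    return [[-1j * ((i == k) * (j == l) - (i == l) * (j == k))
             for l in range(1, N + 1)] for k in range(1, N + 1)]

def combo(a, A, b, B):
    """Return the matrix a*A + b*B."""
    return [[a * A[r][c] + b * B[r][c] for c in range(N)] for r in range(N)]

def mul(X, Y):
    return [[sum(X[r][s] * Y[s][c] for s in range(N)) for c in range(N)]
            for r in range(N)]

def comm(A, B):
    return combo(1, mul(A, B), -1, mul(B, A))

def size(A):
    return max(abs(A[r][c]) for r in range(N) for c in range(N))

def eps(i, j, k):
    """Antisymmetric symbol for 0-based indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

pairs = [((2, 3), (1, 4)), ((3, 1), (2, 4)), ((1, 2), (3, 4))]
Jp = [combo(0.5, J(*p), 0.5, J(*q)) for p, q in pairs]    # J_+^1, J_+^2, J_+^3
Jm = [combo(0.5, J(*p), -0.5, J(*q)) for p, q in pairs]   # J_-^1, J_-^2, J_-^3

def su2_violation(G):
    """Largest entry of [G^i, G^j] - i eps^{ijk} G^k over all i, j."""
    worst = 0.0
    for i, j in product(range(3), repeat=2):
        rhs = [[sum(1j * eps(i, j, k) * G[k][r][c] for k in range(3))
                for c in range(N)] for r in range(N)]
        worst = max(worst, size(combo(1, comm(G[i], G[j]), -1, rhs)))
    return worst

mixed = max(size(comm(Jp[i], Jm[j])) for i, j in product(range(3), repeat=2))
print(su2_violation(Jp), su2_violation(Jm), mixed)
```

The three numbers printed measure the violation of the two $SO(3)$ algebras and of the statement that the plus and minus generators commute.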

I assume that you know that $SO(3)$ is locally isomorphic to $SU(2)$. If you don’t, I give a brief review below.

With a few $i$'s included here and there, these two results prove the statement that the Lorentz group $SO(3, 1)$
is locally isomorphic to $SU(2) \otimes SU(2)$, which we proved explicitly in chapter II.3. The Lorentz group can be
thought of as an “analytic continuation” of the rotation group $SO(4)$. See below for a more precise statement.

One highly non-obvious result of group theory is that $SO(N)$ contains representations other than vector and
tensor. I develop the relevant group theory for the spinor representations in chapter VII.7.


SU(N) 

We next turn to the special unitary group $SU(N)$ consisting of all $N$ by $N$ matrices $U$ that are unitary

$$U^\dagger U = 1 \tag{9}$$

and have unit determinant

$$\det U = 1 \tag{10}$$

The story of $SU(N)$ has more or less the same plot as the story of $SO(N)$ with the crucial difference that the
tensors of the unitary groups can carry both upper and lower indices. We denote the element in the $i$th row and
$j$th column by $U^i_j$; the wisdom of this notation will soon become apparent.

The defining or fundamental representation of $SU(N)$ consists of $N$ objects $\varphi^j$, $j = 1, \ldots, N$, that transform
under the action of the group element $U$ according to

$$\varphi^i \to \varphi'^i = U^i_j \varphi^j \tag{11}$$

Taking the complex conjugate of (11) we have

$$\varphi^{*i} \to \varphi'^{*i} = (U^i_j)^* \varphi^{*j} = (U^\dagger)^j_i\, \varphi^{*j} \tag{12}$$

We invite ourselves to define an object we write as $\varphi_i$ that transforms in the same way as $\varphi^{*i}$; thus

$$\varphi_i \to \varphi'_i = (U^\dagger)^j_i\, \varphi_j \tag{13}$$

Note that we did not say that $\varphi_i$ is equal to $\varphi^{*i}$; we merely said that $\varphi_i$ and $\varphi^{*i}$ transform in the same way.

As before, we can have tensors. The tensor $\varphi^{ij}_k$, for example, transforms as if it is equal to the product $\varphi^i \varphi^j \varphi_k$:

$$\varphi^{ij}_k \to \varphi'^{ij}_k = U^i_l\, U^j_m\, (U^\dagger)^n_k\, \varphi^{lm}_n \tag{14}$$

Again, we emphasize that we did not say that $\varphi^{ij}_k$ is equal to $\varphi^i \varphi^j \varphi_k$. (In some books $\varphi^i$ is called a covariant vector
and $\varphi_i$ a contravariant vector. A tensor $\varphi^{i_1 \cdots i_m}_{j_1 \cdots j_n}$ with $m$ upper indices and $n$ lower indices is defined to transform as
if it is equal to the product of $m$ covariant vectors and $n$ contravariant vectors.)

The possibility of complex conjugation in $SU(N)$ leads naturally to having indices “upstairs” and “downstairs.”
Note that (9) can be written out explicitly as $(U^\dagger)^i_k U^k_j = \delta^i_j$ and thus the Kronecker delta in $SU(N)$ carries one
upper and one lower index. It is important when taking traces that we set an upper index equal to a lower index
and sum over them: for example, we can consider $\delta^k_j \varphi^{ij}_k = \varphi^{ij}_j$, which transforms as

$$\varphi^{ij}_j \to U^i_l\, U^j_m\, (U^\dagger)^n_j\, \varphi^{lm}_n = U^i_l\, \delta^n_m\, \varphi^{lm}_n = U^i_l\, \varphi^{lm}_m \tag{15}$$

where we have used (9). In other words, $\varphi^{ij}_j$, the trace of $\varphi^{ij}_k$, denotes $N$ objects that transform into linear
combinations of each other in the same way as $\varphi^i$. Thus, given a tensor, we can always subtract out its trace.

As in the discussion for $SO(N)$, tensors furnish representations of the group. The discussion proceeds as
before. The symmetry properties of a tensor under permutation of its indices are not changed by the group
transformation.

Another way of saying this is that given a tensor we can always take it to have definite symmetry properties
under permutation of its upper indices and under permutation of its lower indices. In our specific example, we
can always take $\varphi^{ij}_k$ to be either symmetric or antisymmetric under the exchange of $i$ and $j$ and to be traceless.
Thus, the symmetric traceless tensor $\varphi^{ij}_k$ furnishes a representation with dimension $\frac{1}{2}N^2(N+1) - N$ and the
antisymmetric traceless tensor $\varphi^{ij}_k$ a representation with dimension $\frac{1}{2}N^2(N-1) - N$.

Thus, in summary, the irreducible representations of $SU(N)$ are realized by traceless tensors with definite
symmetry properties under permutation of indices. For example, in $SU(5)$, some commonly encountered
representations are $\varphi^i$, $\varphi^{ij}$ (antisymmetric), $\varphi^{ij}$ (symmetric), $\varphi^i_j$, and $\varphi^{ij}_k$ (antisymmetric in the upper indices and
traceless) with dimensions 5, 10, 15, 24, and 45, respectively. Convince yourself that for $SU(N)$ the dimensions
of the representations defined by these tensors are $N$, $N(N-1)/2$, $N(N+1)/2$, $N^2 - 1$, and $\frac{1}{2}N^2(N-1) - N$,
respectively.
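
One way to convince yourself is simply to count independent components. The sketch below is an illustration only; the function name `dims` and the dictionary labels are made up for the purpose. It counts index combinations directly and subtracts the trace conditions.

```python
from itertools import combinations, combinations_with_replacement

def dims(N):
    """Count independent components of the tensors listed in the text for SU(N)."""
    idx = range(N)
    antisym_pairs = len(list(combinations(idx, 2)))                # pairs with i < j
    sym_pairs = len(list(combinations_with_replacement(idx, 2)))   # pairs with i <= j
    return {
        "phi^i": N,
        "phi^{ij} antisymmetric": antisym_pairs,
        "phi^{ij} symmetric": sym_pairs,
        "phi^i_j traceless": N * N - 1,                       # one trace removed
        "phi^{ij}_k antisymmetric, traceless": antisym_pairs * N - N,   # N traces removed
    }

print(list(dims(5).values()))
```

For $N = 5$ the counts are 5, 10, 15, 24, and 45, matching the list above.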

The representation defined by the traceless tensor $\varphi^i_j$ is known as the adjoint representation. By definition,
it transforms according to $\varphi^i_j \to \varphi'^i_j = U^i_l (U^\dagger)^n_j \varphi^l_n = U^i_l \varphi^l_n (U^\dagger)^n_j$. We are thus invited to regard $\varphi^i_j$ as a matrix
transforming according to

$$\varphi \to \varphi' = U\varphi U^\dagger \tag{16}$$

Note that if $\varphi$ is hermitean it stays hermitean, and thus we can take $\varphi$ to be a hermitean traceless matrix. (If $\varphi$
is antihermitean we can always multiply it by $i$.) Another way of saying this is that given a hermitean traceless
matrix $X$, $UXU^\dagger$ is also hermitean and traceless if $U$ is an element of $SU(N)$.

As in the $SO(N)$ story, representations of $SU(N)$ have many names. For example, we can refer to the
representation furnished by a tensor with $m$ upper and $n$ lower indices as $(m, n)$. Alternatively, we can refer
to them by their dimensions, with an asterisk to distinguish representations with mostly lower indices from the
representations with mostly upper indices. For example, an alias for $(1, 0)$ is $N$ and for $(0, 1)$ is $N^*$. A square
bracket is used to indicate that the indices are antisymmetric and a curly bracket indicates that the indices are
symmetric. Thus, the 10 of $SU(5)$ is also known as $[2, 0] = [2]$, where as indicated the 0 (no lower index) is
suppressed. Similarly, $10^*$ is also known as $[0, 2] = [2]^*$.

The condition (10) can be written as either

$$\varepsilon_{i_1 i_2 \cdots i_N}\, U^{i_1}_1 U^{i_2}_2 \cdots U^{i_N}_N = 1 \tag{17}$$

or

$$\varepsilon^{i_1 i_2 \cdots i_N}\, U^1_{i_1} U^2_{i_2} \cdots U^N_{i_N} = 1 \tag{18}$$

Thus, we have two antisymmetric symbols $\varepsilon_{i_1 i_2 \cdots i_N}$ and $\varepsilon^{i_1 i_2 \cdots i_N}$ that we can use to raise and lower indices. Again,
we can immediately generalize (17) to

$$\varepsilon_{i_1 i_2 \cdots i_N}\, U^{i_1}_{j_1} U^{i_2}_{j_2} \cdots U^{i_N}_{j_N} = \varepsilon_{j_1 j_2 \cdots j_N}$$

and multiplying this identity by $(U^\dagger)^{j_N}_{k_N}$ and summing over $j_N$ we obtain

$$\varepsilon_{i_1 \cdots i_{N-1} k_N}\, U^{i_1}_{j_1} \cdots U^{i_{N-1}}_{j_{N-1}} = \varepsilon_{j_1 j_2 \cdots j_N}\, (U^\dagger)^{j_N}_{k_N}$$

Clearly, by repeating this process, we can peel off the $U$'s on the left hand side and put them back as $U^\dagger$'s on the
right hand side. We can play a similar game with (18).

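To avoid drowning in a sea of indices, let me show you how to raise and lower indices in a specific example
rather than in general. Consider the tensor $\varphi^{ij}_k$ in $SU(4)$. We expect that the tensor $\varphi_{kpq} \equiv \varphi^{ij}_k \varepsilon_{ijpq}$ will transform
as a tensor with three lower indices. Indeed,

$$\varphi_{kpq} \to \varphi'^{ij}_k \varepsilon_{ijpq} = \varepsilon_{ijpq}\, U^i_l\, U^j_m\, (U^\dagger)^n_k\, \varphi^{lm}_n = \varepsilon_{lmst}\, (U^\dagger)^s_p (U^\dagger)^t_q (U^\dagger)^n_k\, \varphi^{lm}_n = (U^\dagger)^n_k (U^\dagger)^s_p (U^\dagger)^t_q\, \varphi_{nst}$$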


As in $SO(N)$ we can look at the generators of $SU(N)$ by noting that any unitary matrix can be written as
$U = e^{iH}$, with $H$ hermitean and traceless as required by (9) and (10). There are $(N^2 - 1)$ linearly independent
$N$ by $N$ hermitean traceless matrices $T^a$ ($a = 1, 2, \ldots, N^2 - 1$). Any $N$ by $N$ hermitean traceless matrix can be
written as a linear combination of the $T^a$'s and thus we can write $U = e^{i\theta^a T^a}$, where $\theta^a$ are real numbers and the
index $a$ is summed over.

Since the commutator $[T^a, T^b]$ is antihermitean and traceless, it can also be written as a linear combination
of the $T^a$'s:

$$[T^a, T^b] = i f^{abc}\, T^c \tag{19}$$

(with the index $c$ summed over). The commutation relations (19) define the Lie algebra of $SU(N)$, and $f^{abc}$ are
known as the structure constants. For $SU(2)$ the structure constants $f^{abc}$ are simply given by the antisymmetric
symbol $\varepsilon^{abc}$.

Sometimes students are confused by how the generators act. Consider an infinitesimal transformation
$U \simeq 1 + i\theta^a T^a$. On the defining representation, $\varphi^i \to U^i_j \varphi^j \simeq \varphi^i + i\theta^a (T^a)^i_j \varphi^j$. Thus, the $a$th generator acting
on the defining representation gives $T^a \varphi$. Now consider the adjoint representation (16)

$$\varphi \to \varphi' \simeq (1 + i\theta^a T^a)\,\varphi\,(1 + i\theta^a T^a)^\dagger \simeq \varphi + i\theta^a T^a \varphi - \varphi\, i\theta^a T^a = \varphi + i\theta^a [T^a, \varphi] \tag{20}$$

In other words, the $a$th generator acting on the adjoint representation gives $[T^a, \varphi]$. Perhaps some students are
confused by the fact that $\varphi$ is used as a generic symbol to denote different objects.

Since the adjoint representation $\varphi$ is hermitean and traceless it can also be written as a linear combination of
the generators, thus $\varphi = \varphi^b T^b$. Using (19) we can thus also write (20) as $\varphi^c \to \varphi'^c \simeq \varphi^c - f^{abc}\theta^a \varphi^b$. In particular
for $SU(2)$, the three objects $\varphi^a$ transform as a 3-vector. (Note the notation: $\varphi^a$ is not to be confused with $\varphi^i$: in
$SU(2)$ the index $a = 1, 2, 3$ while $i = 1, 2$.)

This last remark essentially amounts to a proof that $SU(2)$ is locally isomorphic to $SO(3)$. I will now give a
somewhat more formal proof. Any 2 by 2 hermitean traceless matrix $X$ can be written as a linear combination of
the three Pauli matrices $X = \vec{x} \cdot \vec{\sigma}$ with three real coefficients $(x^1, x^2, x^3)$, which we regard as the components of a
3-vector $\vec{x}$. For any element $U$ of $SU(2)$, $X' = U^\dagger X U$ is hermitean and traceless, so that we can write $X' = \vec{x}\,' \cdot \vec{\sigma}$.
Note that we have implicitly used the first defining property of an $SU(2)$ matrix (9). By explicit computation,
we find $\det X = -\vec{x}^{\,2}$. Invoking the second defining property of an $SU(2)$ matrix (10), we obtain $\det X' = \det X$
and thus $\vec{x}\,'^2 = \vec{x}^{\,2}$. The 3-vector $\vec{x}$ is rotated into the 3-vector $\vec{x}\,'$. Thus we can associate a rotation with any given
$U$. Since $U$ and $-U$ are associated with the same rotation, this gives a double covering of $SO(3)$ by $SU(2)$. A
physicist would just say that when a spin $\frac{1}{2}$ particle is rotated through $2\pi$, its wave function changes sign. The
map clearly preserves group multiplication: if two elements $U_1$ and $U_2$ of $SU(2)$ are mapped to the rotations $R_1$
and $R_2$ respectively, then the element $U_1 U_2$ is mapped to the rotation $R_1 R_2$. Alternatively, noting that $\mathrm{tr}\, X^2 = 2\vec{x}^{\,2}$
and $\mathrm{tr}\, X'^2 = \mathrm{tr}\, X^2$, we obtain the same conclusion.
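
The double covering can be made concrete in a few lines. The sketch below is an illustration under stated assumptions: it parametrizes $U = \cos(\theta/2) + i\sin(\theta/2)\,\hat{n}\cdot\vec{\sigma}$ and extracts the rotation from $U^\dagger (\vec{x}\cdot\vec{\sigma}) U = (R\vec{x})\cdot\vec{\sigma}$ via $R_{ij} = \frac{1}{2}\mathrm{tr}(\sigma_i U^\dagger \sigma_j U)$. The function names and sample angles are mine. Note that with this way of writing $R$ as a matrix acting on column vectors, the product $U_1 U_2$ corresponds to $R(U_2)R(U_1)$; the order in which the rotations compose depends on this bookkeeping convention.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(A):
    return [[complex(A[c][r]).conjugate() for c in range(2)] for r in range(2)]

def su2(theta, n):
    """U = cos(theta/2) + i sin(theta/2) n.sigma, for a unit 3-vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * n[2], 1j * s * (n[0] - 1j * n[1])],
            [1j * s * (n[0] + 1j * n[1]), c - 1j * s * n[2]]]

def rotation(U):
    """R_ij = (1/2) tr(sigma_i U^dagger sigma_j U)."""
    Ud = dagger(U)
    R = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            M = mul(SIGMA[i], mul(Ud, mul(SIGMA[j], U)))
            R[i][j] = ((M[0][0] + M[1][1]) / 2).real
    return R

def det3(R):
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))

def max_diff(A, B):
    return max(abs(A[r][c] - B[r][c]) for r in range(len(A)) for c in range(len(A)))

U1 = su2(0.9, (0.0, 0.6, 0.8))      # arbitrary sample elements
U2 = su2(-2.3, (1.0, 0.0, 0.0))
R1, R2 = rotation(U1), rotation(U2)
minus_U1 = [[-z for z in row] for row in U1]
full_turn = su2(2 * math.pi, (0.0, 0.0, 1.0))   # rotation through 2*pi

print(det3(R1), max_diff(rotation(minus_U1), R1), full_turn[0][0])
```

The last printed entry shows that a rotation through $2\pi$ is represented by $-1$ in $SU(2)$, which is the sign change of the spin $\frac{1}{2}$ wave function.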

Once again, the two special unitary groups that most students learn first, namely $SU(2)$ and $SU(3)$, have
special properties that do not generalize to $SU(N)$, just as $SO(3)$ has special properties that do not generalize to
$SO(N)$, possibly leading to confusion.

For $SU(2)$, because the antisymmetric symbols $\varepsilon^{ij}$ and $\varepsilon_{ij}$ carry two indices, it suffices to consider only tensors
with upper indices, all symmetrized: We can raise all lower indices of any tensor by contracting with $\varepsilon^{ij}$ repeatedly.
After this is done, we can remove any pair of indices in which the tensor is antisymmetric by contracting with
$\varepsilon_{ij}$.

In particular, $\varphi^i = \varepsilon^{ij}\varphi_j$, which can be stated equivalently in terms of a special property of the Pauli matrices

$$\sigma_2\, \sigma_a^*\, \sigma_2 = -\sigma_a \tag{21}$$

so that

$$\sigma_2\, (e^{i\vec{\theta}\cdot\vec{\sigma}})^*\, \sigma_2 = e^{i\vec{\theta}\cdot\vec{\sigma}} \tag{22}$$

For $SU(2)$ (11) becomes

$$\varphi^i \to \varphi'^i = (e^{i\vec{\theta}\cdot\vec{\sigma}})^i_j\, \varphi^j$$

Complex conjugating, we obtain

$$\varphi^{*i} \to [(e^{i\vec{\theta}\cdot\vec{\sigma}})^i_j]^*\, \varphi^{*j} = [(-i\sigma_2)\, e^{i\vec{\theta}\cdot\vec{\sigma}}\, (i\sigma_2)]^i_j\, \varphi^{*j}$$

and so

$$i\sigma_2 \varphi^* \to e^{i\vec{\theta}\cdot\vec{\sigma}}\, (i\sigma_2 \varphi^*)$$

We learn that $i\sigma_2\varphi^*$ transforms in the same way as $\varphi$. Recall that we define $\varphi_i$ to transform in the same way as $\varphi^{*i}$.
Thus, $\varepsilon^{ij}\varphi_j$ transforms in the same way as $\varphi^i$. In the jargon, $SU(2)$ is said to have only real and pseudoreal
representations, but not complex representations. A pseudoreal representation is equivalent to its complex
conjugate upon a similarity transformation. Recall that (21) figures into our discussion of charge conjugation in
chapter II.1 and of the Higgs doublet in chapter VII.2.
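
The identities (21) and (22), and the statement that $i\sigma_2\varphi^*$ transforms like $\varphi$, can be confirmed with explicit matrices. This is a hedged sketch: the sample parameters `theta` and `phi` are arbitrary, the helper names are mine, and the closed form $e^{i\vec{\theta}\cdot\vec{\sigma}} = \cos|\theta| + i\sin|\theta|\,\hat{\theta}\cdot\vec{\sigma}$ is used in place of a general matrix exponential.

```python
import math

S = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def conj(A):
    return [[complex(z).conjugate() for z in row] for row in A]

def sandwich(M):
    """Return sigma_2 M^* sigma_2."""
    return mul(S[1], mul(conj(M), S[1]))

def exp_i_theta_sigma(theta):
    """exp(i theta.sigma) = cos|theta| + i sin|theta| (unit vector).sigma."""
    norm = math.sqrt(sum(t * t for t in theta))
    n1, n2, n3 = (t / norm for t in theta)
    c, s = math.cos(norm), math.sin(norm)
    return [[c + 1j * s * n3, 1j * s * (n1 - 1j * n2)],
            [1j * s * (n1 + 1j * n2), c - 1j * s * n3]]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def i_sigma2_conj(v):
    """Return i sigma_2 v^*; note i sigma_2 = [[0, 1], [-1, 0]]."""
    return apply([[0, 1], [-1, 0]], [z.conjugate() for z in v])

def close(A, B, tol=1e-12):
    return all(abs(A[r][c] - B[r][c]) < tol for r in range(2) for c in range(2))

U = exp_i_theta_sigma((0.4, -1.3, 0.8))   # arbitrary SU(2) element
phi = [0.3 + 0.2j, -1.1 + 0.5j]           # arbitrary doublet

identity_21 = all(close(sandwich(s), [[-z for z in row] for row in s]) for s in S)
identity_22 = close(sandwich(U), U)
lhs = i_sigma2_conj(apply(U, phi))        # i sigma_2 (U phi)^*
rhs = apply(U, i_sigma2_conj(phi))        # U (i sigma_2 phi^*)

print(identity_21, identity_22, lhs, rhs)
```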

For $SU(3)$ it suffices to consider only tensors with all their upper indices symmetrized and all their lower
indices symmetrized. Thus, the representations of $SU(3)$ are uniquely labeled by two integers $(m, n)$, where $m$
and $n$ denote the number of upper and lower indices. The reason is that the antisymmetric symbols $\varepsilon^{ijk}$ and
$\varepsilon_{ijk}$ carry three indices. We can always trade a pair of lower indices in which the tensor is antisymmetric for one
upper index, and similarly for upper indices.

You can see easily that these special properties do not generalize beyond $SU(2)$ and $SU(3)$.


Multiplying representations together 

In a course on quantum mechanics you learn how to combine angular momentum. We have already encountered
this concept in (5), which when specialized to $SO(3)$, tells us that $3 \otimes 3 = 5 \oplus 1 \oplus 3$, as we noted. This is
sometimes described by saying that when we combine two angular momentum $L = 1$ states we obtain $L = 0, 1, 2$.
Students are justifiably confused when this procedure is also known as addition of angular momentum.

Given two tensors $\varphi$ and $\eta$ of $SU(N)$, with $m$ upper and $n$ lower indices and with $m'$ upper and $n'$ lower indices,
respectively, we can consider a tensor $T$ with $(m + m')$ upper and $(n + n')$ lower indices that transforms in the
same way as the product $\varphi\eta$. We can then reduce $T$ by the various operations described above. This operation of
multiplying two representations together is of course of fundamental importance in physics. In quantum field
theory, for example, we multiply fields together to construct the Lagrangian.

As an example, multiply $5^*$ and 10 in $SU(5)$. To reduce $T^{ij}_k = \varphi_k \eta^{ij}$ we separate out the trace $\varphi_k \eta^{kj}$ (which
transforms as a 5) after which there is nothing more we can do. Thus,

$$5^* \otimes 10 = 5 \oplus 45 \tag{23}$$

As another example, consider $10 \otimes 10$: $\varphi^{ij}\eta^{kl}$. It is easiest to write $\eta^{kl}$ equivalently as a tensor with three lower
indices $\varepsilon_{mnhkl}\eta^{kl}$. The product $10 \otimes 10$ then carries two upper and three lower indices and we will write it as $T^{ij}_{mnh}$.

Taking traces, we separate out $T^{ij}_{ijh}$, which we recognize as $5^*$, and the traceless part of $T^{ij}_{mnj}$, which we recognize
as $45^*$ (see above), thus obtaining:

$$10 \otimes 10 = 5^* \oplus 45^* \oplus 50^* \tag{24}$$

As exercises you can work out

$$5 \otimes 5 = 10 \oplus 15 \tag{25}$$

and

$$5 \otimes 5^* = 1 \oplus 24 \tag{26}$$

You should recognize the 24 as the adjoint.
In physics we are often called upon to multiply a tensor by itself. Statistics then plays a role. For instance,
$SU(5)$ grand unification contains a scalar field $\varphi^i$ transforming as 5. Because of Bose statistics, the product $\varphi^i\varphi^j$
contains only the 15.
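
A quick sanity check on any proposed decomposition is that the dimensions must add up. The sketch below does only this bookkeeping for (23) through (26); the dictionary layout and names are made up for illustration. It is a necessary check, not a derivation of the decompositions.

```python
# Each entry: (dimensions of the two factors, dimensions of the pieces).
# A conjugate representation has the same dimension as the original.
decompositions = {
    "5* x 10 (23)": ((5, 10), (5, 45)),
    "10 x 10 (24)": ((10, 10), (5, 45, 50)),
    "5 x 5   (25)": ((5, 5), (10, 15)),
    "5 x 5*  (26)": ((5, 5), (1, 24)),
}

def balanced(factors, pieces):
    """True if the product of the factor dimensions equals the sum of the pieces."""
    return factors[0] * factors[1] == sum(pieces)

report = {name: balanced(*entry) for name, entry in decompositions.items()}

# Symmetric part of 5 x 5, the only piece allowed by Bose statistics
N = 5
symmetric_part = N * (N + 1) // 2

for name, ok in report.items():
    print(name, ok)
print("symmetric part of 5 x 5:", symmetric_part)
```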


Restriction to subgroup 


To explain the next group theoretic concept, let me take a physical example. The $SU(3)$ of Gell-Mann and Ne’eman
transforms the three quarks $u$, $d$, and $s$ into linear combinations of each other. It contains as a subgroup the
isospin $SU(2)$ of Heisenberg, which transforms $u$ and $d$, but leaves $s$ alone. In other words, upon restriction to
the subgroup $SU(2)$ the irreducible representation 3 of $SU(3)$ decomposes as

$$3 \to 2 \oplus 1 \tag{27}$$

Consider an irreducible representation with dimension $d$ of some group $G$. When we restrict our attention
to a subgroup $H$, the set of $d$ objects will in general decompose into $n$ subsets, containing $d_1, d_2, \ldots, d_n$ objects,
such that the objects of each subset only transform among themselves under the action of $H$. This makes obvious
sense since there are fewer transformations in $H$ than in $G$.

The decomposition of the fundamental or defining representation specifies how the subgroup $H$ is embedded
in $G$. Since all representations may be built up as products of the fundamental representation, once we know
how the fundamental representation decomposes, we know how all representations decompose. For example,
in $SU(3)$

$$3 \otimes 3^* = 8 \oplus 1 \tag{28}$$

while in $SU(2)$

$$(2 \oplus 1) \otimes (2 \oplus 1) = (3 \oplus 1) \oplus 2 \oplus 2 \oplus 1 \tag{29}$$

Comparing (28) and (29) we learn that

$$8 \to 3 \oplus 1 \oplus 2 \oplus 2 \tag{30}$$

Alternatively, we can simply look at the tensors involved. Consider $\varphi^i$ of $SU(3)$ where the index $i$ takes on the
values 1, 2, 3. Let the index $\mu$ take on the values 1, 2. Obviously, $\varphi^i = \{\varphi^\mu, \varphi^3\}$ corresponds to an explicit display
of (27). Then $\varphi^i_j = \{\bar{\varphi}^\mu_\nu, \varphi^\mu_3, \varphi^3_\mu, \varphi^3_3\}$, where the bar on $\bar{\varphi}^\mu_\nu$ is to remind us that it is traceless. This corresponds
precisely to (30).

Actually, $SU(3)$ also contains the larger subgroup $SU(2) \otimes U(1)$, where the $U(1)$ is generated by the traceless
hermitean matrix

$$\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$

We can then write (27) as $3 \to (2, -1) \oplus (1, 2)$, where the notation is almost self-explanatory. Thus, $(2, -1)$
denotes a 2 under $SU(2)$ with “charge” $-1$ under $U(1)$.

In the text, we will decompose various representations of $SU(5)$ and $SO(10)$. Everything we do there will
simply be somewhat more elaborate versions of what we did here.


More on SO(4), SO(3,1), and SO(2,2)

In chapter II.3 you learned that acting on the two objects $\psi_\alpha$ with $\alpha = 1, 2$ in the spinor representation $(\frac{1}{2}, 0)$,
the generators of rotation and boost are represented by $J_i = \frac{1}{2}\sigma_i$ and $iK_i = \frac{1}{2}\sigma_i$, respectively. I remind you that
the equal sign means “represented by.” For most purposes (for example, classifying quantum fields) and at the
level of rigor of this book, it suffices to think of the Lie algebra generated by commuting $J_i$ and $K_i$. Occasionally,
however, it is useful to contemplate the actual group with group elements $e^{i\vec{\theta}\cdot\vec{J}}$ and $e^{i\vec{\varphi}\cdot\vec{K}}$.

In the spinor representation $(\frac{1}{2}, 0)$ the group elements are represented by $e^{i\vec{\theta}\cdot\vec{\sigma}/2}$ and $e^{\vec{\varphi}\cdot\vec{\sigma}/2}$. While $e^{i\vec{\theta}\cdot\vec{\sigma}/2}$ is special
unitary, the 2 by 2 matrix $e^{\vec{\varphi}\cdot\vec{\sigma}/2}$, bereft of the $i$, is merely special but not unitary. (Incidentally, to verify these and
subsequent statements, since you understand rotation thoroughly, you could, without loss of generality, choose
$\vec{\varphi}$ to point along the third axis, in which case $e^{\vec{\varphi}\cdot\vec{\sigma}/2}$ is diagonal with elements $e^{\varphi/2}$ and $e^{-\varphi/2}$. Thus, while the matrix
is not unitary, its determinant is manifestly equal to 1.) This set of matrices defines the multiplicative group
$SL(2, C)$, consisting of all 2 by 2 complex-valued matrices with unit determinant.

Let us count the number of generators of this group. Two conditions on the determinant (real part = 1,
imaginary part = 0) cut the four complex entries containing eight real numbers down to six numbers, which
accounts for the six generators of the Lorentz group $SO(3, 1)$.

To exhibit the map explicitly, we extend the earlier discussion showing that $SU(2)$ covers $SO(3)$. Consider the
most general 2 by 2 hermitean matrix

$$X_M = x^0 I - \vec{x}\cdot\vec{\sigma} = \begin{pmatrix} x^0 - x^3 & -x^1 + ix^2 \\ -x^1 - ix^2 & x^0 + x^3 \end{pmatrix} \tag{31}$$

By explicit computation, $\det X_M = (x^0)^2 - \vec{x}^{\,2}$. (To see this instantly, choose $\vec{x}$ to point along the third axis
and invoke rotational invariance.) Now consider $X'_M = L^\dagger X_M L$, with $L$ an element of $SL(2, C)$. Manifestly,
$\det X'_M = \det X_M$ and thus the transformation preserves $(x^0)^2 - \vec{x}^{\,2}$ and hence corresponds to Lorentz
transformations. Since $L$ and $-L$ give the same transformation $x \to x'$, we see that $SL(2, C)$ double covers $SO(3, 1)$.
Mathematicians say that $SO(3, 1) = SL(2, C)/Z_2$. If $L$ is also unitary, then $x'^0 = x^0$ and the transformation is
a rotation. The $SU(2)$ subgroup of $SL(2, C)$ double covers the rotation subgroup $SO(3)$ of the Lorentz group
$SO(3, 1)$, that is, $SO(3) = SU(2)/Z_2$.
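
To see the map at work, one can take $X_M = x^0 I - \vec{x}\cdot\vec{\sigma}$ as in (31), act with $X'_M = L^\dagger X_M L$, and read off $x'$. The sketch below is illustrative only: the sample 4-vector, the rapidity `phi`, the matrix `generic`, and the function names are my choices. It checks that a generic $SL(2, C)$ element preserves $(x^0)^2 - \vec{x}^{\,2}$, that $L = \mathrm{diag}(e^{\varphi/2}, e^{-\varphi/2})$ is a boost along the third axis, and that $L$ and $-L$ act identically.

```python
import math

def to_matrix(x):
    """X_M = x^0 I - x.sigma, as in (31); x = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    return [[x0 - x3, -x1 + 1j * x2], [-x1 - 1j * x2, x0 + x3]]

def from_matrix(X):
    """Invert to_matrix for a hermitean X."""
    x0 = (X[0][0] + X[1][1]).real / 2
    x3 = (X[1][1] - X[0][0]).real / 2
    x1 = -(X[0][1] + X[1][0]).real / 2
    x2 = (X[0][1] - X[1][0]).imag / 2
    return (x0, x1, x2, x3)

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def dagger(A):
    return [[complex(A[c][r]).conjugate() for c in range(2)] for r in range(2)]

def lorentz(L, x):
    """x -> x' defined by X' = L^dagger X L."""
    return from_matrix(mul(dagger(L), mul(to_matrix(x), L)))

def interval(x):
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2

x = (2.0, 0.3, -1.1, 0.7)       # arbitrary 4-vector
phi = 0.8                       # arbitrary rapidity
boost = [[math.exp(phi / 2), 0], [0, math.exp(-phi / 2)]]

a, b, c = 1.2 + 0.3j, -0.4j, 0.5
generic = [[a, b], [c, (1 + b * c) / a]]   # last entry fixed so that det = 1
minus_generic = [[-z for z in row] for row in generic]

xb = lorentz(boost, x)
xg = lorentz(generic, x)
xm = lorentz(minus_generic, x)
print(interval(x), interval(xb), interval(xg))
```

With this sign convention the boost gives $x'^0 = x^0\cosh\varphi - x^3\sinh\varphi$ and $x'^3 = x^3\cosh\varphi - x^0\sinh\varphi$, leaving $x^1$ and $x^2$ untouched.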

Incidentally, if we introduce an $i$ at a strategic location and define the 2 by 2 matrix $X_E = x^4 I + i\vec{x}\cdot\vec{\sigma}$, regarding
$(\vec{x}, x^4)$ as a 4-dimensional vector, we have $\det X_E = (x^4)^2 + \vec{x}^{\,2}$, the Euclidean length squared of the 4-vector.
(Once again, choose $\vec{x}$ to point along the third axis so that $X_E$ is a diagonal matrix with elements $x^4 \pm ix^3$.) Since
$e^{i\vec{\theta}\cdot\vec{\sigma}/2} = \cos\frac{\theta}{2} + i\sin\frac{\theta}{2}\,(\hat{\theta}\cdot\vec{\sigma})$ with $\hat{\theta}$ a unit vector in the $\vec{\theta}$ direction (to see this, once again choose $\vec{\theta}$ to point along
the 3rd axis), we see that $X_E/((x^4)^2 + \vec{x}^{\,2})^{\frac{1}{2}}$ is an element of $SU(2)$. (We will come back to this observation in the
next section.) Thus, for any two elements $U$ and $V$ of $SU(2)$, the matrix $X'_E = V^\dagger X_E U$ can also be decomposed
in the form $X'_E = x'^4 I + i\vec{x}\,'\cdot\vec{\sigma}$. Evidently, $\det X'_E = \det X_E$. Thus the transformation preserves $(x^4)^2 + \vec{x}^{\,2}$ and
describes an element of $SO(4)$. This shows explicitly that $SO(4)$ is locally isomorphic to $SU(2) \otimes SU(2)$. If
$V = U$, we have a rotation, and if $V^\dagger = U$, the Euclidean analog of a boost.

Note that while the rotation group $SO(3)$ is compact, the Lorentz group $SO(3, 1)$ is not, since the range of
the boost parameters $\vec{\varphi}$ is unbounded. In contrast, the group $SO(4)$ is compact and thus can be covered by a
compact group, namely, $SU(2) \otimes SU(2)$, but the noncompact group $SO(3, 1)$ cannot be.

At this point, having done $SO(4)$ and $SO(3, 1)$, I might as well (with a wink toward the nuts who complained
that this book is not encyclopedic enough) throw in the group $SO(2, 2)$ for use in part N. Let us strip the Pauli
matrix $\sigma_2$ (kind of a “troublemaker” or at least an odd man out) of his $i$ and define (just for this paragraph)

$$\sigma_2 \equiv \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

Any real 2 by 2 matrix $X_H$ could be decomposed as $X_H = x^4 I + \vec{x}\cdot\vec{\sigma}$. Now $\det X_H = (x^4)^2 + (x^2)^2 - (x^3)^2 - (x^1)^2$,
the quadratic form of a spacetime with two time and two space coordinates. The set of all linear transformations
(with unit determinant) on $(x^1, x^2, x^3, x^4)$ that preserve this quadratic form defines the group $SO(2, 2)$.

Introduce the multiplicative group $SL(2, R)$ consisting of all 2 by 2 real-valued matrices with unit determinant.
For any two elements $L_l$ and $L_r$ of this group, consider the transformation $X'_H = L_l X_H L_r$. Evidently, $\det X'_H =
\det X_H$. This shows explicitly that the group $SO(2, 2)$ is locally isomorphic to $SL(2, R) \otimes SL(2, R)$. Although
two-timing theories are bound to be trouble, we could use $SO(2, 2)$ formally in computing scattering amplitudes, as
we will see in chapter N.3.


Topological quantization of helicity 


As promised, let us go back to the observation in the previous section that the matrix $X_E/((x^4)^2 + \vec x^2)^{1/2}$ is an element of SU(2). Define $w^A = x^A/((x^4)^2 + \vec x^2)^{1/2}$ for $A = 1, 2, 3, 4$. An arbitrary element of SU(2) can be written as $U = w^4 I + i\vec w\cdot\vec\sigma$, with $\det U = 1 = (w^4)^2 + \vec w^2$. The 4-dimensional unit vector $w = (w^4, \vec w)$ traces out the 3-sphere $S^3$, the surface of the 4-ball $B^4$ living in 4-dimensional Euclidean space. Thus the group manifold of SU(2) is $S^3$.

Next, recall that SU(2) double covers the rotation group SO(3), or in plain talk, two elements $U$ and $-U$ of SU(2) correspond to the same rotation. Thus the group manifold of SO(3) is $S^3/Z_2$, that is, the 3-sphere with antipodal points identified.

Consider closed paths in SO(3). Starting at some point $P$ on $S^3$, wander off a bit and come back to $P$. The path you traced can evidently be continuously shrunk to a point. But suppose you go off to the other side of the world and arrive at $-P$, the antipodal point of $P$. You also trace a closed path in SO(3) since $P$ and $-P$ correspond to the same element of SO(3), but this closed path obviously cannot be shrunk to a point. On the other hand, if after arriving at $-P$ you keep going and eventually return to $P$, then the entire path you traced can be continuously shrunk to a point. Using the language of homotopy groups introduced in chapter V.7, we say that $\Pi_1(SO(3)) = Z_2$: there are two topologically inequivalent classes of paths in the 3-dimensional rotation group.

Now we can go back and tie up a loose end in chapter III.4. Back in school you learned that the nonlinear algebraic structure of the Lie algebra $[J_i, J_j] = i\varepsilon_{ijk}J_k$ enforces quantization of angular momentum. But the little group for a massless particle is merely O(2). In the “rich man’s approach” to gauge invariance, how do we get the helicity of the photon and the graviton quantized?

The answer is that we invoke topological, rather than algebraic, quantization. A rotation through $4\pi$ is represented by $e^{i4\pi h}$ on the helicity $h$ state of the massless particle, but the path traced out by this rotation can be continuously shrunk to a point. Hence, we must have $e^{i4\pi h} = 1$ and $h = 0, \pm\frac{1}{2}, \pm 1, \ldots$.

Feynman Rules


Here we gather the Feynman rules given in various chapters. 

Draw all possible diagrams. Label each line with a momentum. If applicable, also label each line with an incoming and an outgoing Lorentz index (for a line describing a vector field), with an incoming and an outgoing internal index (for a line describing a field transforming under an internal symmetry), and so on and so forth. Momentum is conserved at each vertex. Momenta associated with internal lines are to be integrated over with the measure $\int[d^4p/(2\pi)^4]$. A factor of $(-1)$ is to be associated with each closed fermion loop. External lines are to be amputated. For an incoming fermion line write $u(p, s)$ and for an outgoing fermion line $\bar u(p', s')$. For an incoming antifermion, write $\bar v(p, s)$, and for an outgoing antifermion, $v(p', s')$. If there are symmetry transformations leaving the diagram invariant, then we have to worry about the infamous symmetry factors. Since I don't trust the compilations in various textbooks, I work out the symmetry factors from scratch, and that is what I advise you to do.


Scalar field interacting with Dirac field 

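$$\mathcal{L} = \bar\psi(i\gamma^\mu\partial_\mu - m)\psi + \frac{1}{2}[(\partial\varphi)^2 - \mu^2\varphi^2] - \frac{\lambda}{4!}\varphi^4 + f\varphi\bar\psi\psi \qquad (1)$$

Scalar propagator (line carrying momentum $k$):

$$\frac{i}{k^2 - \mu^2 + i\varepsilon} \qquad (2)$$

Scalar vertex:

$$-i\lambda \qquad (3)$$

Fermion propagator (line carrying momentum $p$):

$$\frac{i}{\not{p} - m + i\varepsilon} = i\,\frac{\not{p} + m}{p^2 - m^2 + i\varepsilon} \qquad (4)$$

Scalar fermion vertex:

$$if \qquad (5)$$

Initial external fermion:

$$u(p, s) \qquad (6)$$

Final external fermion:

$$\bar u(p, s) \qquad (7)$$

Initial external antifermion:

$$\bar v(p, s) \qquad (8)$$

Final external antifermion:

$$v(p, s) \qquad (9)$$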


Vector field interacting with Dirac field 

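$$\mathcal{L} = \bar\psi[i\gamma^\mu(\partial_\mu - ieA_\mu) - m]\psi - \frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{1}{2}\mu^2A_\mu A^\mu \qquad (10)$$

Vector boson propagator (line carrying momentum $k$ and indices $\mu$, $\nu$):

$$\frac{i}{k^2 - \mu^2}\left(\frac{k_\mu k_\nu}{\mu^2} - g_{\mu\nu}\right) \qquad (11)$$

Photon propagator (with $\xi$ an arbitrary gauge parameter):

$$\frac{i}{k^2}\left[(1 - \xi)\frac{k_\mu k_\nu}{k^2} - g_{\mu\nu}\right] \qquad (12)$$

Vector boson fermion vertex:

$$ie\gamma^\mu \qquad (13)$$

Initial external vector boson:

$$\varepsilon_\mu(k) \qquad (14)$$

Final external vector boson:

$$\varepsilon_\mu(k)^* \qquad (15)$$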


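Nonabelian gauge theory


Gauge boson propagator (line carrying momentum $k$, Lorentz indices $\mu$, $\nu$, and internal indices $a$, $b$):

$$\frac{i}{k^2}\left[(1 - \xi)\frac{k_\mu k_\nu}{k^2} - g_{\mu\nu}\right]\delta_{ab} \qquad (16)$$

Ghost propagator (line carrying momentum $k$ and internal indices $a$, $b$):

$$\frac{i}{k^2}\,\delta_{ab} \qquad (17)$$

Cubic interaction between the gauge bosons [lines labeled $(a, \mu)$, $(b, \nu)$, $(c, \lambda)$ with momenta $k_1$, $k_2$, $k_3$]:

$$gf^{abc}[g_{\mu\nu}(k_1 - k_2)_\lambda + g_{\nu\lambda}(k_2 - k_3)_\mu + g_{\lambda\mu}(k_3 - k_1)_\nu] \qquad (18)$$

Quartic interaction between the gauge bosons [lines labeled $(a, \mu)$, $(b, \nu)$, $(c, \lambda)$, $(d, \rho)$]:

$$ig^2[f^{abe}f^{cde}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\rho}g_{\nu\lambda}) + f^{ade}f^{cbe}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\nu}g_{\rho\lambda}) + f^{ace}f^{bde}(g_{\mu\nu}g_{\lambda\rho} - g_{\mu\rho}g_{\nu\lambda})] \qquad (19)$$

Gauge boson coupling to the ghost field:

$$gf^{abc}p^\mu \qquad (20)$$


Cross sections and decay rates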


Given the Feynman amplitude $\mathcal{M}$ for a process $p_1 + p_2 \to k_1 + k_2 + \cdots + k_n$ the differential cross section is given by

$$d\sigma = \frac{1}{|\vec v_1 - \vec v_2|\,\mathcal{E}(p_1)\mathcal{E}(p_2)}\,\frac{d^3k_1}{(2\pi)^3\mathcal{E}(k_1)}\cdots\frac{d^3k_n}{(2\pi)^3\mathcal{E}(k_n)}\,(2\pi)^4\delta^{(4)}\Big(p_1 + p_2 - \sum_{i=1}^n k_i\Big)|\mathcal{M}|^2 \qquad (21)$$

Here $\vec v_1$ and $\vec v_2$ denote the velocities of the incoming particles. The energy factors $\mathcal{E}(p) = 2\sqrt{\vec p^{\,2} + m^2}$ for bosons and $\mathcal{E}(p) = \sqrt{\vec p^{\,2} + m^2}/m$ for fermions come from the different normalization of the creation and annihilation operators in chapters I.8 and II.2.

For a decay of a particle of mass $M$ the differential decay rate in its rest frame is given by

$$d\Gamma = \frac{1}{2M}\,\frac{d^3k_1}{(2\pi)^3\mathcal{E}(k_1)}\cdots\frac{d^3k_n}{(2\pi)^3\mathcal{E}(k_n)}\,(2\pi)^4\delta^{(4)}\Big(P - \sum_{i=1}^n k_i\Big)|\mathcal{M}|^2 \qquad (22)$$

Gamma matrices

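Identities for the trace of a product of an even number of gamma matrices:

$$\mathrm{tr}\,\gamma^\mu\gamma^\nu = 4\eta^{\mu\nu} \qquad (1)$$

$$\mathrm{tr}\,\gamma^\mu\gamma^\nu\gamma^\lambda\gamma^\sigma = 4(\eta^{\mu\nu}\eta^{\lambda\sigma} - \eta^{\mu\lambda}\eta^{\nu\sigma} + \eta^{\mu\sigma}\eta^{\nu\lambda}) \qquad (2)$$

We define the totally antisymmetric symbol $\varepsilon^{\mu\nu\lambda\sigma}$ by $\varepsilon^{0123} = +1$ (note $\varepsilon_{0123} = -1$). Then with our definition $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, we have

$$\mathrm{tr}\,\gamma^5\gamma^\mu\gamma^\nu\gamma^\lambda\gamma^\sigma = -4i\varepsilon^{\mu\nu\lambda\sigma} \qquad (3)$$

Identities that follow from the basic Clifford identity:

$$\gamma^\mu\not{p}\,\gamma_\mu = -2\not{p} \qquad (4)$$

$$\gamma^\mu\not{p}\not{q}\,\gamma_\mu = 4p\cdot q \qquad (5)$$

$$\gamma^\mu\not{p}\not{q}\not{r}\,\gamma_\mu = -2\not{r}\not{q}\not{p} \qquad (6)$$

I leave it to you to derive these identities. For example, to obtain (4) keep moving $\gamma^\mu$ to the right in the expression $\gamma^\mu\not{p}\,\gamma_\mu = 2p^\mu\gamma_\mu - \not{p}\,\gamma^\mu\gamma_\mu = 2\not{p} - 4\not{p} = -2\not{p}$.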
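
If you would rather let a machine do the checking, the trace and contraction identities of this section can be verified by brute force. What follows is a sketch under stated assumptions: it uses an explicit Dirac-basis representation with metric signature $(+,-,-,-)$, stores matrices as nested lists, and every helper name (`mul`, `slash`, `contract`, and so on) is invented here for illustration.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ETA = [1, -1, -1, -1]  # diagonal of the metric eta^{mu nu}

def scale(c, m):
    return [[c * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mul(*ms):
    """Ordered product of any number of 4x4 matrices."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(r[k] * m[k][j] for k in range(4)) for j in range(4)] for r in out]
    return out

def tr(m):
    return sum(m[i][i] for i in range(4))

def block(a, b, c, d):
    """Assemble a 4x4 matrix from four 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

# Dirac basis: gamma^0 = diag(I, -I), gamma^i = [[0, sigma^i], [-sigma^i, 0]]
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))]
GAMMA += [block(Z2, s, scale(-1, s), Z2) for s in SIGMA]
GAMMA5 = scale(1j, mul(*GAMMA))  # gamma^5 = i gamma^0 gamma^1 gamma^2 gamma^3

def eta(m, n):
    return ETA[m] if m == n else 0

def eps(*idx):
    """Totally antisymmetric symbol with eps(0, 1, 2, 3) = +1 (upper indices)."""
    if len(set(idx)) < 4:
        return 0
    inversions = sum(idx[i] > idx[j] for i in range(4) for j in range(i + 1, 4))
    return -1 if inversions % 2 else 1

def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (p^0, ..., p^3)."""
    out = scale(0, GAMMA[0])
    for mu in range(4):
        out = add(out, scale(ETA[mu] * p[mu], GAMMA[mu]))
    return out

def contract(a):
    """gamma^mu A gamma_mu, summed over mu."""
    out = scale(0, a)
    for mu in range(4):
        out = add(out, scale(ETA[mu], mul(GAMMA[mu], a, GAMMA[mu])))
    return out

def dot(p, q):
    return sum(ETA[mu] * p[mu] * q[mu] for mu in range(4))

def same(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def check_identities():
    """Assert identities (1)-(6) of this section; return True if all hold."""
    for m, n in product(range(4), repeat=2):
        assert abs(tr(mul(GAMMA[m], GAMMA[n])) - 4 * eta(m, n)) < 1e-12
    for m, n, l, s in product(range(4), repeat=4):
        t4 = tr(mul(GAMMA[m], GAMMA[n], GAMMA[l], GAMMA[s]))
        rhs = 4 * (eta(m, n) * eta(l, s) - eta(m, l) * eta(n, s) + eta(m, s) * eta(n, l))
        assert abs(t4 - rhs) < 1e-12
        t5 = tr(mul(GAMMA5, GAMMA[m], GAMMA[n], GAMMA[l], GAMMA[s]))
        assert abs(t5 - (-4j) * eps(m, n, l, s)) < 1e-12
    # arbitrary sample momenta, chosen only for the test
    p, q, r = (1.0, 0.2, -0.5, 0.3), (2.0, 1.1, 0.4, -0.7), (0.5, -0.9, 0.6, 0.1)
    ps, qs, rs = slash(p), slash(q), slash(r)
    assert same(contract(ps), scale(-2, ps))
    assert same(contract(mul(ps, qs)), scale(4 * dot(p, q), block(I2, Z2, Z2, I2)))
    assert same(contract(mul(ps, qs, rs)), scale(-2, mul(rs, qs, ps)))
    return True
```

Since the identities are basis independent, any other representation of the Clifford algebra (the Weyl basis, say) should pass the same checks.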


Evaluating Feynman diagrams 


Over the years, a number of tricks and identities have been developed for evaluating the integrals associated with 
Feynman diagrams. 

Let us evaluate

$$I = \int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - m^2 + i\varepsilon)^3} = \int\frac{d^3k}{(2\pi)^3}\int\frac{dk_0}{2\pi}\,\frac{1}{[k_0^2 - (\vec k^2 + m^2) + i\varepsilon]^3}$$

Focus on the $k_0$ integral. Draw where the poles are in the complex $k_0$-plane and you will see that the integration contour can be rotated anticlockwise so that [we denote the integrand by $f(k_0)$]

$$\int_{-\infty}^{\infty}dk_0\,f(k_0) = \int_{-i\infty}^{i\infty}dk_0\,f(k_0) = i\int_{-\infty}^{\infty}dk_4\,f(ik_4)$$

where in the last step we define $k_0 = ik_4$ (corresponding to the Wick rotation mentioned in chapters I.2 and V.2). Thus,

$$I = -i\int\frac{d^4_Ek}{(2\pi)^4}\,\frac{1}{(k_E^2 + m^2)^3}$$

where $d^4_Ek$ is the integration element in Euclidean 4-dimensional space and $k_E^2 = k_4^2 + \vec k^2$ the square of a Euclidean 4-vector. The infinitesimal $\varepsilon$ can now be set equal to zero. We can integrate immediately over the three angles since the integrand does not depend on them. You can look up the angular element in Euclidean space in a book, but we will use a neat trick instead.

I will do the more general $d$-dimensional integral $H = \int d^dk\,F(k^2)$, where $k^2 = k_1^2 + k_2^2 + \cdots + k_d^2$ and $F$ can be any function as long as the integral converges. (I now drop the subscript $E$; the context makes clear that we are in Euclidean space.) We can of course set $d$ equal to 4 at the end. The result for arbitrary $d$ will be useful to us in regularizing dimensionally (chapter III.1).

We imagine integrating over the $(d - 1)$ angular variables to obtain $H = C(d)\int_0^\infty dk\,k^{d-1}F(k^2)$. To determine $C(d)$ we will do the integral $J = \int d^dk\,e^{-\frac{1}{2}k^2}$ in two different ways. Using (I.2.8) we have $J = (\sqrt{2\pi})^d$. Alternatively,

$$J = C(d)\int_0^\infty dk\,k^{d-1}e^{-\frac{1}{2}k^2} = C(d)\,2^{\frac{d}{2}-1}\int_0^\infty dx\,x^{\frac{d}{2}-1}e^{-x} = C(d)\,2^{\frac{d}{2}-1}\,\Gamma\Big(\frac{d}{2}\Big)$$

where we changed integration variables and recognized the integral representation of the gamma function $\Gamma(z + 1) = \int_0^\infty dx\,x^ze^{-x}$. (Recall that upon integration by parts we obtain $\Gamma(z + 1) = z\Gamma(z)$, so that $\Gamma(n) = (n - 1)!$ for $n$ an integer.) Therefore $C(d) = 2\pi^{d/2}/\Gamma(d/2)$ and

$$\int d^dk\,F(k^2) = \frac{2\pi^{d/2}}{\Gamma(d/2)}\int_0^\infty dk\,k^{d-1}F(k^2) \qquad (8)$$

Setting $d = 1$ in (8) we determine $\Gamma(\frac{1}{2}) = \sqrt{\pi}$, and setting $F(k^2) = \delta(k - 1)$ we see that the area of the $(d - 1)$-dimensional sphere is equal to $C(d)$, thus recovering various results you learned in school about circles and spheres: $C(2) = 2\pi$ and $C(3) = 4\pi$.
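
As a quick sanity check on $C(d) = 2\pi^{d/2}/\Gamma(d/2)$, here is a small sketch using only the standard library's gamma function; the function names are chosen here for illustration and are not part of any established package.

```python
import math

def sphere_area(d):
    """C(d) = 2 pi^(d/2) / Gamma(d/2): area of the unit (d-1)-sphere."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def gaussian_two_ways(d):
    """J = int d^d k exp(-k^2/2), once directly and once via the radial integral."""
    direct = math.sqrt(2.0 * math.pi) ** d
    radial = sphere_area(d) * 2.0 ** (d / 2.0 - 1.0) * math.gamma(d / 2.0)
    return direct, radial
```

Calling `sphere_area(2)` and `sphere_area(3)` reproduces the circumference of the unit circle and the area of the unit sphere, and `sphere_area(4)` gives the $2\pi^2$ needed for Feynman integrals in four dimensions.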

The new result you need as a budding field theorist is for $d = 4$:

$$\int d^4k\,F(k^2) = \pi^2\int_0^\infty dk^2\,k^2F(k^2) \qquad (9)$$

So finally we have

$$I = \frac{-i}{16\pi^2}\int_0^\infty dk^2\,k^2\,\frac{1}{(k^2 + m^2)^3} = \frac{-i}{16\pi^2}\,\frac{1}{2m^2} \qquad (10)$$

We have derived the basic formula for doing Feynman integrals:

$$\int\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - m^2 + i\varepsilon)^3} = \frac{-i}{32\pi^2m^2} \qquad (11)$$

(With the telltale $i\varepsilon$ we have evidently moved back to Minkowski space.) As an exercise you can go through the same steps to find

$$\int^\Lambda\frac{d^4k}{(2\pi)^4}\,\frac{1}{(k^2 - m^2 + i\varepsilon)^2} = \frac{i}{16\pi^2}\left[\log\Big(\frac{\Lambda^2}{m^2}\Big) - 1 + \cdots\right] \qquad (12)$$

Here a cutoff is needed, which we introduce by setting the upper limit in the integral over $k^2$ in the analog of (10) to $\Lambda^2$. As a check, differentiate (12) with respect to $m^2$ to recover (11). As another exercise show that

$$\int^\Lambda\frac{d^4k}{(2\pi)^4}\,\frac{k^2}{(k^2 - m^2 + i\varepsilon)^2} = \frac{-i}{16\pi^2}\left[\Lambda^2 - 2m^2\log\Big(\frac{\Lambda^2}{m^2}\Big) + m^2 + \cdots\right] \qquad (13)$$
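
Formulas (11), (12), and (13) can be checked against a direct numerical evaluation of the Wick-rotated radial integrals. The sketch below assumes the Euclidean forms obtained as in (10), namely a prefactor $\mp i/16\pi^2$ times $\int dx\,x^a/(x + m^2)^b$ with $x = k_E^2$; the Simpson integrator, the change of variable to $u = \log(x + m^2)$, and all function names are choices made here for illustration.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def radial(num_power, den_power, m2, cutoff2):
    """Integral of x**num_power / (x + m2)**den_power over 0 <= x <= cutoff2.

    Uses u = log(x + m2) so that a large cutoff is covered evenly.
    """
    lo, hi = math.log(m2), math.log(cutoff2 + m2)

    def integrand(u):
        s = math.exp(u)                      # s = x + m2, dx = s du
        return (s - m2) ** num_power * s ** (1 - den_power)

    return simpson(integrand, lo, hi)

def formula_11(m2, cutoff2=1e8):
    """Left side of (11); the cutoff only stands in for infinity."""
    return -1j / (16 * math.pi ** 2) * radial(1, 3, m2, cutoff2)

def formula_12(m2, cutoff2):
    """Left side of (12) with the k^2 integral cut off at Lambda^2."""
    return 1j / (16 * math.pi ** 2) * radial(1, 2, m2, cutoff2)

def formula_13(m2, cutoff2):
    """Left side of (13); the extra k^2 = -k_E^2 flips the overall sign."""
    return -1j / (16 * math.pi ** 2) * radial(2, 2, m2, cutoff2)
```

For $\Lambda^2/m^2 = 10^6$ the numerical values agree with the right sides of (12) and (13) up to the neglected terms of order $m^2/\Lambda^2$.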


In (12) and (13) $(\cdots)$ denote terms that vanish for $\Lambda^2 \gg m^2$. In some texts, the $(-1)$ in (12) is dropped by absorbing it into $\Lambda^2$. But then we have to be careful to adjust (13) accordingly if it appears in the same calculation.

A useful identity in combining denominators is


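$$\frac{1}{x_1x_2\cdots x_n} = (n - 1)!\int_0^1\int_0^1\cdots\int_0^1 d\alpha_1d\alpha_2\ldots d\alpha_n\,\delta\Big(1 - \sum_j\alpha_j\Big)\frac{1}{(\alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n)^n} \qquad (14)$$

For $n = 2$,

$$\frac{1}{xy} = \int_0^1 d\alpha\,\frac{1}{[\alpha x + (1 - \alpha)y]^2} \qquad (15)$$

and for $n = 3$,

$$\frac{1}{xyz} = 2\int_0^1\int_0^1\int_0^1 d\alpha\,d\beta\,d\gamma\,\delta(\alpha + \beta + \gamma - 1)\frac{1}{(\alpha x + \beta y + \gamma z)^3} = 2\int\!\!\int_{\text{triangle}}d\alpha\,d\beta\,\frac{1}{[z + \alpha(x - z) + \beta(y - z)]^3} \qquad (16)$$

where the integration region is the triangle in the $\alpha$-$\beta$ plane bounded by $0 \le \beta \le 1 - \alpha$ and $0 \le \alpha \le 1$.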
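
The $n = 2$ and $n = 3$ forms of the identity are easy to test numerically for positive $x$, $y$, $z$. The sketch below uses a hand-written Simpson rule, nested once for the triangle; the function names and the sample values are illustrative assumptions only.

```python
def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def two_denominators(x, y):
    """Right side of (15): integral over alpha of [alpha x + (1 - alpha) y]^(-2)."""
    return simpson(lambda a: 1.0 / (a * x + (1.0 - a) * y) ** 2, 0.0, 1.0)

def three_denominators(x, y, z):
    """Right side of (16): twice the integral over the triangle 0 <= beta <= 1 - alpha."""
    def inner(a):
        return simpson(lambda b: 1.0 / (z + a * (x - z) + b * (y - z)) ** 3,
                       0.0, 1.0 - a)
    return 2.0 * simpson(inner, 0.0, 1.0)
```

With $x = 1$, $y = 2$, $z = 3$ the two functions should reproduce $1/xy$ and $1/xyz$ to well within the accuracy of the quadrature.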



Appendix E 


Dotted and Undotted Indices and the Majorana Spinor 


We develop the dotted and undotted notation introduced in chapter II.3 for further use in discussing supersymmetry in chapter VIII.4 and in part N. In essence, the appearance of undotted and dotted indices can be traced back to the fact that the algebra of the Lorentz group SO(3, 1), with the generators $\vec J + i\vec K$ and $\vec J - i\vec K$, breaks up into two pieces, each isomorphic to the algebra of SU(2). The absence or presence of the dot allows us to keep track of which SU(2) we are talking about.

Here I will use extensively results from chapter II.3 and from the exercises (do them!) there without bothering to write them down again here.

In the Weyl basis of chapter II.1

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} \qquad (1)$$

where $\sigma^\mu = (I, \vec\sigma)$ and $\bar\sigma^\mu = (I, -\vec\sigma)$. Knowing that $\gamma^\mu$ acts on

$$\Psi = \begin{pmatrix} \psi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix}$$

we see that $\sigma^\mu$ and $\bar\sigma^\mu$ carry indices as follows:

$$(\sigma^\mu)_{\alpha\dot\alpha} \quad\text{and}\quad (\bar\sigma^\mu)^{\dot\alpha\alpha} \qquad (2)$$

This is consistent with what you know: the Lorentz vector transforms like $(\frac{1}{2}, \frac{1}{2})$ and thus straddles the two SU(2)’s. The matrices $\sigma^\mu$ and $\bar\sigma^\mu$ mix dotted and undotted indices. We will make good use of this observation later.

Let us check that the Lorentz transformation property of the Dirac spinor $\Psi$ is consistent with what was discussed in chapter II.1. There we learned that $\Psi \to e^{-\frac{i}{4}\omega_{\mu\nu}\Sigma^{\mu\nu}}\Psi$, where $\Sigma^{\mu\nu} = \frac{i}{2}[\gamma^\mu, \gamma^\nu]$. (We want to use the symbol $\sigma^{\mu\nu}$ for some other quantity, hence the change of notation.) Using (1) we obtain

$$\Sigma^{\mu\nu} = 2i\begin{pmatrix} \sigma^{\mu\nu} & 0 \\ 0 & \bar\sigma^{\mu\nu} \end{pmatrix}$$

where $\sigma^{\mu\nu} = \frac{1}{4}(\sigma^\mu\bar\sigma^\nu - \sigma^\nu\bar\sigma^\mu)$ and $\bar\sigma^{\mu\nu} = \frac{1}{4}(\bar\sigma^\mu\sigma^\nu - \bar\sigma^\nu\sigma^\mu)$. From (2) we see that these two matrices carry indices as follows:

$$(\sigma^{\mu\nu})_\alpha{}^\beta \quad\text{and}\quad (\bar\sigma^{\mu\nu})^{\dot\alpha}{}_{\dot\beta} \qquad (3)$$

Again, this reflects the fact that the antisymmetric tensor (such as the electromagnetic field $F_{\mu\nu}$) transforms like $(1, 0) + (0, 1)$.

The matrices $\sigma^{\mu\nu}$ and $\bar\sigma^{\mu\nu}$ may seem alien, but recall that they are manufactured out of the familiar Pauli matrices and so they are simply Pauli matrices (what else could they be?) themselves. In particular,

$$\sigma^{0i} = -\bar\sigma^{0i} = -\frac{1}{2}\sigma^i \quad\text{and}\quad \sigma^{ij} = \bar\sigma^{ij} = -\frac{i}{2}\varepsilon^{ijk}\sigma^k$$

Note that these relations are consistent with $(\sigma^{\mu\nu})^\dagger = -(\bar\sigma^{\mu\nu})$, which in turn follows from $(\Sigma^{\mu\nu})^\dagger = \gamma^0\Sigma^{\mu\nu}\gamma^0$.

Mother Nature is kind to the students of quantum field theory. The relativistic spinor $\Psi$ breaks up into two 2-component spinors acted on by the Pauli matrices. What you learned in nonrelativistic quantum mechanics continues to be relevant here.

Thus under an infinitesimal Lorentz transformation

$$\psi_\alpha \to (I + \tfrac{1}{2}\omega_{\mu\nu}\sigma^{\mu\nu})_\alpha{}^\beta\,\psi_\beta \qquad (4)$$

and

$$\bar\chi^{\dot\alpha} \to (I + \tfrac{1}{2}\omega_{\mu\nu}\bar\sigma^{\mu\nu})^{\dot\alpha}{}_{\dot\beta}\,\bar\chi^{\dot\beta} \qquad (5)$$

You should check that it all works out according to plan. Everything is consistent with what we learned in chapter II.3, in particular, that boosts act oppositely on $(\frac{1}{2}, 0)$ and $(0, \frac{1}{2})$, but rotations act the same.

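Thus far, on the spinor fields $\psi_\alpha$ and $\bar\chi^{\dot\alpha}$, the dotted indices always live upstairs and the undotted indices downstairs. What would get them to change floors? Charge conjugation.

Recall from chapter II.1 that the charge conjugated field is defined by $\Psi^c = C\bar\Psi^T$ [where $T$ denotes transpose, $\bar\Psi$ means $\Psi^\dagger\gamma^0$, and $C^{-1}\gamma^\mu C = -(\gamma^\mu)^T$]. In the Weyl basis, we can choose

$$C = \zeta\gamma^0\gamma^2 = \zeta\begin{pmatrix} -\sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} \qquad (6)$$

The condition $(\Psi^c)^c = \Psi$ implies $|\zeta| = 1$. We choose $\zeta = -i$. Explicitly,

$$\Psi^c = i\gamma^2\Psi^* = \begin{pmatrix} i\sigma^2(\bar\chi)^* \\ -i\sigma^2\psi^* \end{pmatrix}$$

We now introduce some notation, the wisdom of which will soon become clear. Given $\psi_\alpha$ and $\bar\chi^{\dot\alpha}$, define

$$\bar\psi_{\dot\alpha} \equiv (\psi_\alpha)^* \quad\text{and}\quad \chi^\alpha \equiv (\bar\chi^{\dot\alpha})^* \qquad (7)$$

Weird, complex conjugation puts on a dot and a bar.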

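We raise and lower undotted indices as follows: $\psi_\alpha = \varepsilon_{\alpha\beta}\psi^\beta$ and $\psi^\alpha = \varepsilon^{\alpha\beta}\psi_\beta$, which implies that $\varepsilon_{\alpha\beta}\varepsilon^{\beta\gamma} = \delta_\alpha{}^\gamma$.

Thus, if we choose

$$\varepsilon_{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = (i\sigma^2)_{\alpha\beta}$$

then

$$\varepsilon^{\alpha\beta} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = (-i\sigma^2)^{\alpha\beta}$$

We are forced to define $\varepsilon_{12} = +1$ and $\varepsilon^{12} = -1$ to have opposite signs, a fact to keep in mind.

You should realize by now that what we are doing can again be traced back to that peculiar fact about Pauli matrices (appendix B):

$$(i\sigma^2)\sigma_i^*(-i\sigma^2) = -\sigma_i \qquad (8)$$

or equivalently

$$\sigma^2\sigma_i^T\sigma^2 = -\sigma_i \qquad (9)$$

an identity in one guise or another familiar from quantum mechanics. We have used it again and again, in appendix B and in the text (for example, in connection with Majorana masses and with the Higgs field). From (8) we have $(i\sigma^2)(\sigma^\mu)^*(-i\sigma^2) = \bar\sigma^\mu$ and hence

$$(i\sigma^2)(\sigma^{\mu\nu})^*(-i\sigma^2) = \bar\sigma^{\mu\nu} \qquad (10)$$

Analogously, we raise and lower dotted indices as follows: $\bar\psi_{\dot\alpha} = \varepsilon_{\dot\alpha\dot\beta}\bar\psi^{\dot\beta}$ and $\bar\psi^{\dot\alpha} = \varepsilon^{\dot\alpha\dot\beta}\bar\psi_{\dot\beta}$. Referring to (7) we see that $\varepsilon_{\dot\alpha\dot\beta}$ is numerically the same as $\varepsilon_{\alpha\beta}$, and $\varepsilon^{\dot\alpha\dot\beta}$ is numerically the same as $\varepsilon^{\alpha\beta}$.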

You now see the rationale of these apparently capricious choices: we can now write

$$\Psi^c = \begin{pmatrix} \chi_\alpha \\ \bar\psi^{\dot\alpha} \end{pmatrix} \qquad (11)$$

Referring to

$$\Psi = \begin{pmatrix} \psi_\alpha \\ \bar\chi^{\dot\alpha} \end{pmatrix} \qquad (12)$$

we see that the point of the notation is that $\psi_\alpha$ and $\chi_\alpha$ transform in the same way and are the same kind of creature (and similarly for $\bar\chi^{\dot\alpha}$ and $\bar\psi^{\dot\alpha}$).

We now come to the all-important concept of a Majorana spinor. Ettore Majorana, a brilliant physicist, mysteriously disappeared early in his career. Fermi supposedly described Majorana as “a towering giant without any common sense.” 1

Given a Dirac spinor $\Psi$, if $\Psi = \Psi^c$, then $\Psi$ is said to be a Majorana spinor.

Comparing (12) and (11), we see that a Majorana spinor has the form

$$\Psi_M = \begin{pmatrix} \psi_\alpha \\ \bar\psi^{\dot\alpha} \end{pmatrix} \qquad (13)$$

An obvious remark but a handy mnemonic: Given a Weyl spinor $\psi_\alpha$ we can construct a Majorana spinor, and given two Weyl spinors we can construct a Dirac spinor: one Weyl equals one Majorana, and two Weyls equal one Dirac.

Incidentally, another way of seeing that complex conjugation puts on a dot is that (see chapter II.3) conjugation interchanges $\vec J + i\vec K$ and $\vec J - i\vec K$.

The point to remember is simply that given a spinor $\lambda_\alpha$, then $\bar\lambda_{\dot\alpha}$ transforms like $(\lambda_\alpha)^*$. You should verify this, keeping in mind (10).

The utility of the notation is similar to that of the covariant and contravariant (or upper and lower) indices in special and general relativity. We always contract an upper index with a lower index. Here we have the additional rule that an undotted upper index can only be contracted with an undotted lower index, but never with a dotted lower index (obviously, since they belong to different algebras). It is easy to verify these rules. For example, let us show that $\eta^\alpha\psi_\alpha$ is invariant. Using (4) and writing $\omega\sigma$ for $\omega_{\mu\nu}\sigma^{\mu\nu}$, we proceed with laboriously careful pedagogy:

$$\eta^\alpha = \varepsilon^{\alpha\beta}\eta_\beta \to \varepsilon^{\alpha\beta}(e^{\frac{1}{2}\omega\sigma})_\beta{}^\gamma\,\eta_\gamma = [(e^{-\frac{1}{2}\omega\sigma})^T]^\alpha{}_\beta\,\varepsilon^{\beta\gamma}\eta_\gamma = [(e^{-\frac{1}{2}\omega\sigma})^T]^\alpha{}_\beta\,\eta^\beta \qquad (14)$$

where we used once again the identity (9). Then $\eta^\alpha\psi_\alpha \to \eta^\beta(e^{-\frac{1}{2}\omega\sigma})_\beta{}^\alpha(e^{\frac{1}{2}\omega\sigma})_\alpha{}^\gamma\psi_\gamma = \eta^\gamma\psi_\gamma$, which is indeed an invariant.

In special and general relativity we raise and lower indices with the metric, which is of course symmetric. Here we raise and lower indices with the antisymmetric $\varepsilon$ symbol and as a result signs pop up here and there. For example, $\eta^\alpha\psi_\alpha = \varepsilon^{\alpha\beta}\eta_\beta\psi_\alpha = \eta_\beta(-\varepsilon^{\beta\alpha})\psi_\alpha = -\eta_\beta\psi^\beta$. Contrast this with the scalar product of two vectors $v^\mu w_\mu = v_\mu w^\mu$. If we want to suppress indices and write $\eta\psi$, we must decide once and for all what that means. The standard convention is to define

$$\eta\psi \equiv \eta^\alpha\psi_\alpha \qquad (15)$$

and not $\eta_\alpha\psi^\alpha$. This rule is sometimes stated by saying that in contracting undotted indices we always go from the northwest to the southeast, and never from southeast to northwest. As we learned in chapter II.5, spinor fields are to be treated as anticommuting Grassmann variables under the path integral, so that $-\eta_\alpha\psi^\alpha = \psi^\alpha\eta_\alpha$. We end up with the nice rule $\eta\psi = \psi\eta$.


1 M. Gell-Mann, private communication. Incidentally, the name Ettore corresponds to Hector in English.

Similarly, we define

$$\bar\chi\bar\xi \equiv \bar\chi_{\dot\alpha}\bar\xi^{\dot\alpha} = \bar\xi\bar\chi \qquad (16)$$

In contracting dotted indices we always go from southwest to northeast. Of course, none of this “Santa Barbara to Cambridge” convention is needed if the indices are displayed explicitly.

Just as in special and general relativity, where the upper and lower indices are very useful in telling us whether expressions we write down make sense, the undotted and dotted upper and lower indices allow us to see immediately that $\eta\psi$ and $\eta\sigma^{\mu\nu}\psi$ make sense, but that $\eta\sigma^\mu\psi$ does not. [Look at (2) and (3) and notice the kind of indices that appear.] The notation of course just codifies in a convenient way the group theory fact that $(\frac{1}{2}, 0) \otimes (\frac{1}{2}, 0) = (0, 0) \oplus (1, 0)$, namely, that out of two Weyl spinors we can make a scalar and a tensor but not a vector.

As always, notation should be driven by physics and computational convenience (which is intimately connected to elegance).

To gain familiarity with the dotted and undotted 2-component notation, you should work out some of the 
identities in the exercises. These identities are useful when working with supersymmetric field theories. 


Exercises 

E.1 Show that $\eta\sigma^{\mu\nu}\psi = -\psi\sigma^{\mu\nu}\eta$ and $\bar\chi\bar\sigma^\mu\psi = -\psi\sigma^\mu\bar\chi$.

E .2 Show that (0<p)(xl) = ~\ (^lXx^)- 

E.3 Show that $\theta_\alpha\theta_\beta = \frac{1}{2}(\theta\theta)\varepsilon_{\alpha\beta}$. [Hint: simply evaluate the two sides for all possible cases.]



Solutions to Selected Exercises 


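I.3.1 From the text we have for $x^0 = 0$,

$$D(x) = -i\int\frac{d^3k}{(2\pi)^3\,2\sqrt{\vec k^2 + m^2}}\,e^{-i\vec k\cdot\vec x} = \frac{-i}{2(2\pi)^2}\int_0^\infty\frac{dk\,k^2}{\sqrt{k^2 + m^2}}\int_{-1}^{1}d(\cos\theta)\,e^{ikr\cos\theta}$$

$$= \frac{-1}{2(2\pi)^2r}\int_0^\infty\frac{dk\,k}{\sqrt{k^2 + m^2}}\,(e^{ikr} - e^{-ikr}) = \frac{-1}{8\pi^2r}\int_{-\infty}^{\infty}\frac{dk\,k}{\sqrt{k^2 + m^2}}\,e^{ikr} = \frac{i}{8\pi^2r}\,\frac{d}{dr}\int_{-\infty}^{\infty}\frac{dk}{\sqrt{k^2 + m^2}}\,e^{ikr}$$

The integrand in $I = \int_{-\infty}^{\infty}(dk/\sqrt{k^2 + m^2})\,e^{ikr}$ has a cut along the imaginary axis going from $im$ to $i\infty$ (and another cut we don’t care about). So fold the contour around the cut and change variable to $k = i(m + y)$:

$$I = 2\int_0^\infty dy\,\frac{e^{-(y+m)r}}{\sqrt{(y + m)^2 - m^2}}$$

At this point you can look in a table and find that this is some Bessel function and read off the large $r$ behavior, but it is more stylish to press on and descend steeply: with $y + m = m\cosh t$ we obtain

$$D(x) = -\frac{im}{4\pi^2r}\int_0^\infty dt\,(\cosh t)\,e^{-mr\cosh t} = -\frac{im}{4\pi^2r}\int_0^\infty d(\sinh t)\,e^{-mr\cosh t}$$

$$= -\frac{im}{4\pi^2r}\int_0^\infty ds\,e^{-mr\sqrt{1 + s^2}} \simeq -\frac{im}{4\pi^2r}\int_0^\infty ds\,e^{-mr(1 + \frac{1}{2}s^2)} = -\frac{im^2}{4\pi^2}\left(\frac{\pi}{2(mr)^3}\right)^{\frac{1}{2}}e^{-mr}$$

using the Gaussian integral from the appendix of chapter I.2.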


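I.3.2 We evaluate

$$D(x) = \int\frac{d^2k}{(2\pi)^2}\,\frac{e^{ikx}}{k^2 - m^2 + i\varepsilon}$$

by contours as in the text and obtain

$$D(x) = -i\int\frac{dk}{(2\pi)\,2\omega_k}\,[e^{-i(\omega_kt - kx)}\theta(x^0) + e^{i(\omega_kt - kx)}\theta(-x^0)]$$

For $x^0 = 0$, we recognize the integral

$$D(x) = -i\int_{-\infty}^{+\infty}\frac{dk}{(2\pi)\,2\sqrt{k^2 + m^2}}\,e^{-ikx}$$

as a Bessel function from exercise I.3.1:

$$D(x) = -\frac{i}{2\pi}K_0(m|x|) \to -\frac{i}{2\pi}\sqrt{\frac{\pi}{2m|x|}}\,e^{-m|x|}$$

with the expected exponential decay for large $x$.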


I. 7.2 Expanding and keeping only the desired terms 


Z(J)^C 




d 4 Wid 4 u>2 


s 

4 

8 

_iSJ(wi) _ 


[iSJ(w 2 ) \ 


1 

6! 




d A xd A y J (x)D(x 


y)J(y) 


Just keep on differentiating. 


I.7.4 Write $k_1 = (\sqrt{k^2 + m^2}, 0, 0, k)$ and $k_2 = (\sqrt{k^2 + m^2}, 0, 0, -k)$. Then $E = 2\sqrt{k^2 + m^2} > 2m$. Physically, a pair of mesons can be produced when $E > 2m$.

I.8.1 Do the $k^0$ integral on the left-hand side of (I.8.14): $\int dk^0\,\delta((k^0)^2 - \omega_k^2)\,\theta(k^0)f(k^0, \vec k)$, where $\omega_k = +\sqrt{\vec k^2 + m^2}$. Using (I.2.12) and picking up the positive root because of the step function, we obtain $\int_0^\infty dk^0\,(\delta(k^0 - \omega_k)/(2k^0))f(k^0, \vec k) = f(\omega_k, \vec k)/(2\omega_k)$.

To verify the invariance explicitly, boost in the $x$ direction and drop the subscript on $\omega_k$: $k_x \to \sinh\varphi\,\omega + \cosh\varphi\,k_x$ and $\omega \to \cosh\varphi\,\omega + \sinh\varphi\,k_x$. Then, using $\omega^2 = (k_x)^2 + \cdots$ and hence $\omega\,d\omega = k_x\,dk_x$, we have $dk_x \to (\sinh\varphi\,(k_x/\omega) + \cosh\varphi)\,dk_x$. Hence $dk_x/\omega \to dk_x/\omega$.

I.8.2 Clearly, only the terms $aa^\dagger$ and $a^\dagger a$ in $H$ contribute to $\langle\vec k'|H|\vec k\rangle$. Extract these two types of terms in

$$\int d^Dx\,\varphi(x)^2 = \int d^Dx\int\!\!\int\frac{d^Dq\,d^Dq'}{\sqrt{(2\pi)^D2\omega_q}\sqrt{(2\pi)^D2\omega_{q'}}}\,[a(\vec q)a^\dagger(\vec q')\,e^{-i(\omega_q - \omega_{q'})t + i(\vec q - \vec q')\cdot\vec x} + \text{h.c.}]$$

$$= \int\frac{d^Dq}{2\omega_q}\,[a(\vec q)a^\dagger(\vec q) + a^\dagger(\vec q)a(\vec q)]$$

and so $H$ is for our purposes effectively equal to $\int d^Dq\,\frac{1}{2}\omega_q[a(\vec q)a^\dagger(\vec q) + a^\dagger(\vec q)a(\vec q)]$, which upon using the commutation relation is equal to $\int d^Dq\,\frac{1}{2}\omega_q[\delta^{(D)}(0) + 2a^\dagger(\vec q)a(\vec q)]$. We recognize the first term as the vacuum energy calculated in the text. Note that the definition of the delta function $(2\pi)^D\delta^{(D)}(\vec k) = \int d^Dx\,e^{i\vec k\cdot\vec x}$ implies $\delta^{(D)}(0) = [1/(2\pi)^D]\int d^Dx = V/(2\pi)^D$. Thus, subtracting off the vacuum energy, we have $H$ effectively equal to $\int d^Dq\,\omega_q\,a^\dagger(\vec q)a(\vec q)$, which just says that a mode of momentum $\vec q$ carries energy $\omega_q$. In particular, using the commutation relation twice we have $\langle\vec k'|H|\vec k\rangle = \delta^{(D)}(\vec k' - \vec k)\,\omega_k$. The energy of a particle of momentum $\vec k$ is $\omega_k$, relative to the vacuum.


I.8.4 $Q = \int d^Dx\,J_0(x) = \int d^Dx\,(\varphi^\dagger i\partial_0\varphi - i(\partial_0\varphi^\dagger)\varphi)$. Focus on the first term:

$$\int d^Dx\int\!\!\int\frac{d^Dk'\,d^Dk}{\sqrt{(2\pi)^D2\omega_{k'}}\sqrt{(2\pi)^D2\omega_k}}\,[a^\dagger(\vec k')e^{i(\omega_{k'}t - \vec k'\cdot\vec x)} + b(\vec k')e^{-i(\omega_{k'}t - \vec k'\cdot\vec x)}]\,\omega_k\,[a(\vec k)e^{-i(\omega_kt - \vec k\cdot\vec x)} - b^\dagger(\vec k)e^{i(\omega_kt - \vec k\cdot\vec x)}]$$

Note that $i\partial_0$ brings down a factor of $\omega_k$ and produces a relative sign between $a$ and $b^\dagger$. As in exercise I.8.2 the integral over $x$ produces a delta function that collapses the two $k$ integrals into one, giving

$$\int d^Dk\,\tfrac{1}{2}\big(a^\dagger(\vec k)a(\vec k) - b(\vec k)b^\dagger(\vec k) - a^\dagger(-\vec k)b^\dagger(\vec k)e^{2i\omega_kt} + b(-\vec k)a(\vec k)e^{-2i\omega_kt}\big)$$

The second term $-i(\partial_0\varphi^\dagger)\varphi$ in $J_0(x)$ is just the hermitean conjugate of the first term $\varphi^\dagger i\partial_0\varphi$. Thus, adding the hermitean conjugate of what we have just obtained, we find

$$Q = \int d^Dk\,[a^\dagger(\vec k)a(\vec k) - b^\dagger(\vec k)b(\vec k)] - \delta^{(D)}(0)\int d^Dk$$

The infinite additive constant is to be subtracted out much like the vacuum energy. In some texts a normal ordering operation, denoted by a pair of colons, is defined as follows: If you see $:(\cdots):$ you are instructed to move all the creation operators in the expression $(\cdots)$ to the left of the annihilation operators. In other words, by fiat $:b(\vec k)b^\dagger(\vec k): = b^\dagger(\vec k)b(\vec k)$. The current is then defined by $J_\mu(x) = :(\varphi^\dagger i\partial_\mu\varphi - i(\partial_\mu\varphi^\dagger)\varphi):$

Since the normal ordered current differs from the naively defined current by a c-number, the most crucial property of the current, namely current conservation $\partial_\mu J^\mu = 0$, is not affected. This is of course just a formal way of saying that the value of the charge in the vacuum state is to be subtracted. In any case, the result $Q = \int d^Dk\,[a^\dagger(\vec k)a(\vec k) - b^\dagger(\vec k)b(\vec k)]$ shows that $a$ and $b$ annihilate positive and negative charges, respectively.



I.10.2 We have (repeated indices summed)

$$R_{aa'}R_{bb'}\,iD_{a'b'}(x) = \int D\varphi\,R_{aa'}\varphi_{a'}(x)\,R_{bb'}\varphi_{b'}(0)\,e^{iS}$$

But we can change the integration variable from $\varphi$ to $R\varphi$. Since the action $S$ and the measure $D\varphi$ are both invariant under $SO(N)$ rotations, this is equal to $\int D\varphi\,\varphi_a(x)\varphi_b(0)\,e^{iS} = iD_{ab}(x)$. Thus, we obtain $D_{ab} = R_{aa'}R_{bb'}D_{a'b'}$. The properties of the rotation group are such that the only solution of this equation is $D_{ab}$ proportional to $\delta_{ab}$.

I.10.3 The field $\varphi$ transforms as a symmetric traceless tensor (see appendix B) under SO(3), that is, with all indices displayed, $\varphi_{ab} \to R_{aa'}R_{bb'}\varphi_{a'b'} = R_{aa'}\varphi_{a'b'}R^T_{b'b} = (R\varphi R^T)_{ab}$. As suggested in the hint, writing $\varphi$ as a 3 by 3 symmetric traceless matrix we have $\varphi \to R\varphi R^T$ and thus the invariants are (up to quartic order in $\varphi$) $\mathrm{tr}(\partial_\mu\varphi)^2$, $\mathrm{tr}\,\varphi^2$, $\mathrm{tr}\,\varphi^4$, and $(\mathrm{tr}\,\varphi^2)^2$. Remarkably, you can prove that $\mathrm{tr}\,\varphi^4$ and $(\mathrm{tr}\,\varphi^2)^2$ actually amount to only one invariant by diagonalizing

$$\varphi = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & -(\alpha + \beta) \end{pmatrix}$$

You can see by computation that $\mathrm{tr}\,\varphi^4$ and $(\mathrm{tr}\,\varphi^2)^2$ are both proportional to $[\alpha^2 + \beta^2 + (\alpha + \beta)^2]^2$. Thus, if we restrict ourselves to quartic terms the Lagrangian $\mathcal{L} = \frac{1}{2}\mathrm{tr}(\partial_\mu\varphi)^2 - \frac{1}{2}m^2\,\mathrm{tr}\,\varphi^2 - \lambda(\mathrm{tr}\,\varphi^2)^2$ actually has an SO(5) symmetry (since $\varphi$ has 5 components). This is an example of what is known as “accidental symmetry.” Convince yourself that this holds only to quartic order in $\varphi$.
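
The claim that $\mathrm{tr}\,\varphi^4$ and $(\mathrm{tr}\,\varphi^2)^2$ are not independent is easy to test in exact integer arithmetic; explicitly, $\mathrm{tr}\,\varphi^4 = \frac{1}{2}(\mathrm{tr}\,\varphi^2)^2$ for any traceless symmetric 3 by 3 matrix. The short sketch below checks this both for the diagonal form and for a sample nondiagonal matrix; the function names and the sample entries are made up for illustration.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(m):
    return sum(m[i][i] for i in range(3))

def quartic_invariants(phi):
    """Return (tr phi^4, (tr phi^2)^2) for a 3x3 matrix phi."""
    phi2 = matmul3(phi, phi)
    phi4 = matmul3(phi2, phi2)
    return trace(phi4), trace(phi2) ** 2

def diagonal(alpha, beta):
    """The diagonalized form diag(alpha, beta, -(alpha + beta))."""
    return [[alpha, 0, 0], [0, beta, 0], [0, 0, -(alpha + beta)]]

# A symmetric traceless sample matrix (entries chosen arbitrarily).
sample = [[2, 1, -3],
          [1, -5, 4],
          [-3, 4, 3]]
tr4, tr2_squared = quartic_invariants(sample)
```

For the sample matrix one finds `tr4 == 4050` and `tr2_squared == 8100`, consistent with the factor of one half.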


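I.11.2 Varying $g^{\mu\rho}g_{\rho\lambda} = \delta^\mu_\lambda$ we have $(\delta g^{\mu\rho})g_{\rho\lambda} = -g^{\mu\rho}(\delta g_{\rho\lambda})$, which upon multiplication by $g^{\lambda\nu}$ becomes $\delta g^{\mu\nu} = -g^{\mu\rho}(\delta g_{\rho\lambda})g^{\lambda\nu}$. You may recognize this as just the statement $\delta M^{-1} = -M^{-1}(\delta M)M^{-1}$ for a matrix $M$. To evaluate $\delta g$ we use the important identity $\det M = e^{\mathrm{Tr}\log M}$, which you can prove easily by diagonalizing $M$ with a similarity transformation. The left hand side is equal to the product of the eigenvalues of $M$, while the right hand side is equal to the exponential of the sum of logarithms of the eigenvalues. {You can define the logarithm of a matrix by expanding $\log[I + (M - I)]$ in a power series in $(M - I)$.}

Thus, $\delta\det M = (\det M)\,\mathrm{tr}\,M^{-1}\delta M$ and so $\delta g = g\,g^{\nu\mu}\delta g_{\mu\nu}$. We are now ready to vary

$$S = \int d^4x\sqrt{-g}\,\tfrac{1}{2}(g^{\mu\nu}\partial_\mu\varphi\,\partial_\nu\varphi - m^2\varphi^2) \equiv \int d^4x\sqrt{-g}\,\mathcal{L}$$

Plugging in, we have

$$\delta S = \int d^4x\sqrt{-g}\,\big[\tfrac{1}{2}g^{\nu\mu}\delta g_{\mu\nu}\,\mathcal{L} - \tfrac{1}{2}g^{\mu\rho}(\delta g_{\rho\lambda})g^{\lambda\nu}\partial_\mu\varphi\,\partial_\nu\varphi\big]$$

Thus,

$$T^{\mu\nu} = -\frac{2}{\sqrt{-g}}\,\frac{\delta S}{\delta g_{\mu\nu}} = g^{\mu\rho}g^{\nu\lambda}\partial_\rho\varphi\,\partial_\lambda\varphi - g^{\mu\nu}\mathcal{L}$$

In the flat spacetime limit

$$T^{00} = (\partial_0\varphi)^2 - \mathcal{L} = \tfrac{1}{2}\big((\partial_0\varphi)^2 + (\vec\nabla\varphi)^2 + m^2\varphi^2\big)$$

precisely the energy density as promised.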


I.11.3 Using the expression for $T^{\mu\nu}$ from the preceding exercise, we have

$$P^i = \int d^3x\,T^{0i} = -\int d^3x\,\partial_0\varphi\,\partial_i\varphi$$

and

$$[P^i, \varphi(x)] = -\int d^3y\,[\partial_0\varphi(y), \varphi(x)]\,\partial_i\varphi(y) = i\partial_i\varphi(x)$$

Thus, combined with the fact that $P^0 = H$, we have $[P^\mu, \varphi(x)] = -i\partial^\mu\varphi(x)$, which just reflects the fact that $P^\mu$ and $x^\nu$ are conjugate variables.


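I.11.4 Evaluating $T^{\mu\nu} = -F^{\mu\lambda}F^\nu{}_\lambda - \eta^{\mu\nu}\mathcal{L}$, we have

$$T_{ij} = -F_{i\lambda}F_j{}^\lambda + \tfrac{1}{2}\delta_{ij}(\vec E^2 - \vec B^2) = -E_iE_j + F_{ik}F_{jk} + \tfrac{1}{2}\delta_{ij}(\vec E^2 - \vec B^2)$$

Since $F_{ik}F_{jk} = \varepsilon_{ikm}\varepsilon_{jkn}B_mB_n = \delta_{ij}\vec B^2 - B_iB_j$, we obtain the announced result. Note that $\delta_{ij}T_{ij} = \frac{1}{2}(\vec E^2 + \vec B^2) = T^{00}$ and hence $T = 0$.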


Part II 

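II.1.1 Continuing the hint, we have

$$\delta(\bar\psi\gamma^\mu\gamma^5\psi) = \tfrac{i}{4}\omega_{\lambda\rho}\,\bar\psi[\sigma^{\lambda\rho}, \gamma^\mu\gamma^5]\psi = \tfrac{i}{4}\omega_{\lambda\rho}\,\bar\psi[\sigma^{\lambda\rho}, \gamma^\mu]\gamma^5\psi$$

since $\gamma^5$ anticommutes with gamma matrices and hence commutes with the product of two gamma matrices. Inserting $[\sigma^{\lambda\rho}, \gamma^\mu]$ as given in the text we have $\delta(\bar\psi\gamma^\mu\gamma^5\psi) = \omega^\mu{}_\lambda\,\bar\psi\gamma^\lambda\gamma^5\psi$, which is precisely how a vector transforms. Under parity $\bar\psi\gamma^\mu\gamma^5\psi \to \bar\psi\gamma^0\gamma^\mu\gamma^5\gamma^0\psi$, which equals $\bar\psi\gamma^5\gamma^0\psi = -\bar\psi\gamma^0\gamma^5\psi$ for $\mu = 0$, and $\bar\psi\gamma^0\gamma^i\gamma^5\gamma^0\psi = \bar\psi\gamma^i\gamma^5\psi$ for $\mu = i$. The time component flips sign while the spatial components do not. Thus, the behavior under parity is opposite to that of a normal vector: $\bar\psi\gamma^\mu\gamma^5\psi$ is an axial vector. The other cases proceed similarly.

II.1.2 From $\psi_L = \frac{1}{2}(1 - \gamma^5)\psi$ and $\psi_R = \frac{1}{2}(1 + \gamma^5)\psi$, we find $\bar\psi_L = \psi_L^\dagger\gamma^0 = \psi^\dagger\frac{1}{2}(1 - \gamma^5)\gamma^0 = \bar\psi\frac{1}{2}(1 + \gamma^5)$ and $\bar\psi_R = \bar\psi\frac{1}{2}(1 - \gamma^5)$. We then just use the properties of $P_L$ and $P_R$ repeatedly. For example, $\bar\psi_L\psi_R = \bar\psi\frac{1}{2}(1 + \gamma^5)\psi$ and $\bar\psi_R\psi_L = \bar\psi\frac{1}{2}(1 - \gamma^5)\psi$, or equivalently, $\bar\psi\psi = \bar\psi_L\psi_R + \bar\psi_R\psi_L$ and $\bar\psi\gamma^5\psi = \bar\psi_L\psi_R - \bar\psi_R\psi_L$. As another example, $\bar\psi_L\gamma^\mu\psi_L = \bar\psi\frac{1}{2}(1 + \gamma^5)\gamma^\mu\frac{1}{2}(1 - \gamma^5)\psi = \bar\psi\gamma^\mu\frac{1}{2}(1 - \gamma^5)\psi$ and $\bar\psi_R\gamma^\mu\psi_R = \bar\psi\gamma^\mu\frac{1}{2}(1 + \gamma^5)\psi$. Note that various combinations vanish, for example, $\bar\psi_L\psi_L = 0$, $\bar\psi_L\gamma^\mu\psi_R = 0$, and so on. Complete the exercise.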


II.1.3-4 In the appropriate basis the Dirac equation becomes

$$\begin{pmatrix} E - m & p\sigma^3 \\ -p\sigma^3 & -E - m \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix} = 0$$

that is, $(E - m)\phi + p\sigma^3\chi = 0$ and $-p\sigma^3\phi - (E + m)\chi = 0$. The second equation informs us that $\chi = -[p/(E + m)]\sigma^3\phi$. For a slow electron $\chi \simeq -(p/2m)\sigma^3\phi$, so that $\chi$ is smaller than $\phi$ by the factor $p/2m$. The first equation then reduces to $(E - m - p^2/2m)\phi = 0$, which just reminds us of the relation between energy and momentum in the nonrelativistic limit.



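II.1.5 In the Weyl basis the Dirac equation for a relativistic electron moving along the 3-axis $E(\gamma^0 - \gamma^3)\psi = 0$ becomes

$$\begin{pmatrix} 0 & I - \sigma^3 \\ I + \sigma^3 & 0 \end{pmatrix}\begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} = 0$$

Since

$$\sigma^{12} = \tfrac{i}{2}[\gamma^1, \gamma^2] = -\tfrac{i}{2}[\sigma^1, \sigma^2] \otimes I = \sigma^3 \otimes I = \begin{pmatrix} \sigma^3 & 0 \\ 0 & \sigma^3 \end{pmatrix}$$

under a rotation around the 3-axis, $\psi_L \to e^{-(i/4)\omega\sigma^3}\psi_L = e^{+(i/4)\omega}\psi_L$ while $\psi_R \to e^{-(i/4)\omega\sigma^3}\psi_R = e^{-(i/4)\omega}\psi_R$. Indeed, the left and right handed fields rotate in opposite directions.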


11 . 1.6 In the Weyl basis, the Dirac equation y • pu = 0 becomes cr^p^rj = 0, and cr^p^x = 0, with 



The solutions are 



p° + p 3 


and 77 = 


$p^1 + ip^2$
for the two possible helicities. The corresponding solutions for $\chi$ may be obtained by $\vec p \to -\vec p$. We have
uu = p ■ p = 0. For a particle moving in the +3 direction, 77 = 0 and 



W 


and x = 0* The Lorentz vector uy^u = (2£) 2 (1, 0, 0, 1). (What other direction could it point in?) For 
a particle moving in the —3 direction, 77 and x exchange roles. This exercise shows explicitly that for 
massless particles we can use 2-component spinors. (What happens if parity is broken?) 

II.1.8 In either the Dirac or the Weyl basis, $(\psi_c)_c = \gamma^2(\gamma^2\psi^*)^* = \psi$.

II.1.9 It is easiest to work in either the Dirac or the Weyl basis. Let $\psi$ be left handed, that is $(1 + \gamma^5)\psi = 0$. Then $(1 - \gamma^5)\psi_c = (1 - \gamma^5)\gamma^2\psi^* = \gamma^2(1 + \gamma^5)\psi^* = 0$ since $\gamma^5$ is real.

II.I.IO x/rCf -4 = tct since (a Xp ) T C = -Ca kp . 

II.1.12 Under parity or reflection in a mirror, $x^1 \to x^1$ and $x^2 \to -x^2$. Choose $\gamma^0 = \sigma^3$, $\gamma^0\gamma^1 = \sigma^1$, and $\gamma^0\gamma^2 = \sigma^2$. Multiply the Dirac equation $(i\gamma^\mu\partial_\mu - m)\psi = 0$ by $\gamma^0$ and write $[i(\partial_0 + \gamma^0\gamma^i\partial_i) - \gamma^0m]\psi = 0$. Then multiplying by $\sigma^1$ reverses the sign of the $\partial_2$ term, but also the mass term. I leave it to you to discuss time reversal.


11. 2.1 Apply Noether for the transformation 

i/r —>■ e l0 \js = (1 + Z 0 )l/r 


Then 


SC 

8 ( 8 ^) 


Sxjf + 


SC 

s(W 


8x]s = i/ay M (z$i/r) 


Note that formally C does not depend on 3^,1 fr. Thus, up to overall factors we can choose J l> = i// y 11 '// 
with the corresponding charge Q = f d^xxjsy 0 ^ into which we plug (II.2.10) 


f(x) = 


/ 


d 3 p 

(2n)V 2 (E p /m)V 2 


s)u(p, s)e ,px + d l (p, s)v(p, s)e tpx ] 


At this point, the calculation pretty much parallels what you did in exercise I.8.4. The integration $\int d^3x$ over space produces a delta function that sets the momentum variables in $\bar\psi$ and in $\psi$ equal to one another. The new feature here is that we encounter objects such as $\bar u\gamma^0 u$. Invoking Lorentz invariance and referring to the rest frame form of $u$ and $v$ we have $\bar u(p,s)\gamma^\mu u(p,s') = \delta_{ss'}\,p^\mu/m$, $\bar u(p,s)\gamma^\mu v(p,s') = 0$, and so on. We obtain

$$Q = \int \frac{d^3p}{(2\pi)^3(E_p/m)} \sum_s \left[ b^\dagger(p,s)\,b(p,s) + d(p,s)\,d^\dagger(p,s) \right]$$

As in exercise I.8.4 we have to move the creation operator $d^\dagger$ to the left of the annihilation operator $d$ and subtract off an infinite constant. Thus, finally

$$Q = \int \frac{d^3p}{(2\pi)^3(E_p/m)} \sum_s \left[ b^\dagger(p,s)\,b(p,s) - d^\dagger(p,s)\,d(p,s) \right]$$

showing clearly that $b$ annihilates a negative charge and $d$ a positive charge.

To calculate $[Q, \psi(0)] = \int d^3x\,[\bar\psi(x)\gamma^0\psi(x), \psi(0)]$ we use the identity $[AB, C] = A\{B, C\} - \{A, C\}B$ and the canonical anticommutation relation (II.2.4). We find $[Q, \psi(0)] = -\psi(0)$, thus showing that $b$ and $d^\dagger$ must carry the same charge.
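The operator identity $[AB, C] = A\{B, C\} - \{A, C\}B$ used above holds for any three operators, so it can be spot-checked on random matrices. The following is a minimal sketch in plain Python; the helper names (`matmul`, `commutator`, `anticommutator`, `identity_residual`) are illustrative and not part of the text.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), -1)

def anticommutator(a, b):
    return combine(matmul(a, b), matmul(b, a), +1)

def random_matrix(n, rng):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

def identity_residual(n=4, seed=0):
    """Largest entry of [AB, C] - (A{B, C} - {A, C}B) for random A, B, C."""
    rng = random.Random(seed)
    A, B, C = (random_matrix(n, rng) for _ in range(3))
    lhs = commutator(matmul(A, B), C)
    rhs = combine(matmul(A, anticommutator(B, C)),
                  matmul(anticommutator(A, C), B), -1)
    return max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))

if __name__ == "__main__":
    print(identity_residual())
```

The residual vanishes up to rounding because the two $ACB$ terms cancel identically, which is why the identity is convenient when only anticommutators of the fields are known.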


II.3.4 The desired equations are $(\gamma^\mu\psi_\mu)_\alpha = 0$ (this takes out 4 components since $\alpha$ takes on 4 values) and $(\not p - m)_\alpha{}^\beta\,\psi_{\beta\mu} = 0$ (for each $\mu$ this takes out 2 components and so altogether $4 \times 2 = 8$ components). Thus, $16 - 4 - 8 = 4$ components as desired. Another way of saying this is that $\gamma^\mu\psi_\mu$ is a Dirac spinor and hence the spin $\frac12$ part of the vector-spinor $\psi_\mu$.

II.6.4 It is good practice to be as symmetrical as one can in calculations. So define $p_3 = -P_1$ and $p_4 = -P_2$ and add the 6 (not 3) combinations appearing in the definitions of $s$, $t$, and $u$, thus obtaining

$$2(s + t + u) = (p_1 + p_2)^2 + (p_3 + p_4)^2 + (p_3 + p_1)^2 + (p_4 + p_2)^2 + (p_4 + p_1)^2 + (p_3 + p_2)^2$$

$$= 3\sum_{i=1}^{4} m_i^2 + 2\,(p_1 \cdot p_2 + p_3 \cdot p_4 + p_1 \cdot p_3 + p_2 \cdot p_4 + p_1 \cdot p_4 + p_2 \cdot p_3)$$

The second group of terms on the right-hand side collect into $\left(\sum_{i=1}^{4} p_i\right)^2 - \sum_{i=1}^{4} m_i^2$. (Obviously, we have for convenience changed notation slightly, setting $m_3 = M_1$ and $m_4 = M_2$.)
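Since $\sum_i p_i = 0$, the relation above reduces to $s + t + u = \sum_i m_i^2$, which is easy to confirm numerically. The sketch below builds center-of-mass kinematics for a two-to-two process; the function names and the sample masses are illustrative assumptions, and the two-body momentum formula is the standard one, not something taken from the text.

```python
import math

def minkowski_sq(p):
    """p^2 with signature (+, -, -, -) for p = (E, px, py, pz)."""
    E, x, y, z = p
    return E * E - x * x - y * y - z * z

def two_body_momenta(sqrt_s, ma, mb, theta=0.0):
    """Back-to-back on-shell momenta in the center-of-mass frame (x-z plane)."""
    s = sqrt_s ** 2
    kallen = (s - (ma + mb) ** 2) * (s - (ma - mb) ** 2)
    k = math.sqrt(kallen) / (2.0 * sqrt_s)
    kx, kz = k * math.sin(theta), k * math.cos(theta)
    pa = (math.hypot(k, ma), kx, 0.0, kz)
    pb = (math.hypot(k, mb), -kx, 0.0, -kz)
    return pa, pb

def mandelstam(sqrt_s, m1, m2, M1, M2, theta):
    """Return (s, t, u) with s = (p1+p2)^2, t = (P1-p1)^2, u = (P2-p1)^2."""
    p1, p2 = two_body_momenta(sqrt_s, m1, m2)
    P1, P2 = two_body_momenta(sqrt_s, M1, M2, theta)
    add = lambda a, b: tuple(x + y for x, y in zip(a, b))
    sub = lambda a, b: tuple(x - y for x, y in zip(a, b))
    return (minkowski_sq(add(p1, p2)),
            minkowski_sq(sub(P1, p1)),
            minkowski_sq(sub(P2, p1)))

if __name__ == "__main__":
    s, t, u = mandelstam(5.0, 0.5, 1.0, 1.5, 2.0, theta=0.7)
    print(s + t + u, 0.5 ** 2 + 1.0 ** 2 + 1.5 ** 2 + 2.0 ** 2)
```

The sum is independent of the scattering angle, as it must be, since only momentum conservation and the mass-shell conditions enter.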


II. 6.5 Referring to (C.ll) we see that in da the factor 

-- ^ - y -...- y - (2jr)4 

\V 1 - v 2 \£(Pi)£(P 2 ) (2 n) i £(k 1 ) (2n) i £(k n ) 

reduces to |(m/.E) 4 [l/(2jr) 2 ]. Integrating the factor d 2 P l d 2 P 2 S < ^ (p 2 + p 2 — Pi — P 2 ) over P 2 we knock 
off 3 of the delta functions, leaving us with dQd P\P 2 & (2 E — E{), and so the integral over P 2 gives j P 2 . 

Finally, the factor containing the “real physics” is \ X] JZ |A4 | 2 = (e 4 /4m 4 )/(0). Multiplying the three 

* S 

factors together and dividing by dQ, we obtain da/dQ = (j) 5 [c 4 /(2ir) 2 ](l/P 2 )/(@), as given in the text. 
Note that m cancels out as expected. We should be able to take the limit m -> 0 compared to the energies 
without the cross section either blowing up or vanishing. 


II.6.7

$$\Gamma = \frac{|\mathcal{M}|^2}{2M} \int \frac{d^3k}{(2\pi)^3\,2\omega}\,\frac{d^3k'}{(2\pi)^3\,2\omega'}\,(2\pi)^4\,\delta^{(4)}(q - k - k')$$

Knock off the $k'$ integral and do the angular part of the $k$ integral to obtain

$$\Gamma = \frac{|\mathcal{M}|^2}{8\pi M} \int \frac{dk\,k^2}{\omega\omega'}\,\delta\!\left(\sqrt{k^2 + m^2} + \sqrt{k'^2 + m^2} - M\right)$$

Using (I.2.12), we evaluate the integral as $(k^2/\omega\omega')\,[1/(k/\omega + k/\omega')] = k/M$. Solving $\sqrt{k^2 + m^2} + \sqrt{k'^2 + m^2} = M$ for $k$ we obtain the stated result.
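The last two steps can be checked numerically. The sketch below allows two possibly different final-state masses `ma` and `mb` (an assumption made here for generality), solves the energy condition by bisection, and compares with the standard closed form $k = \sqrt{[M^2 - (m_a + m_b)^2][M^2 - (m_a - m_b)^2]}/2M$, which is quoted as a known kinematic result rather than taken from the text. It also confirms that the delta function integral equals $k/M$.

```python
import math

def energy_sum(k, ma, mb):
    """Total energy of two back-to-back particles with momentum magnitude k."""
    return math.hypot(k, ma) + math.hypot(k, mb)

def solve_k(M, ma, mb):
    """Bisection for the k that satisfies energy_sum(k) = M."""
    if M <= ma + mb:
        raise ValueError("decay is kinematically forbidden")
    lo, hi = 0.0, M  # energy_sum(0) < M and energy_sum(M) > M bracket the root
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if energy_sum(mid, ma, mb) < M:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def closed_form_k(M, ma, mb):
    """Standard two-body decay momentum."""
    return math.sqrt((M * M - (ma + mb) ** 2) * (M * M - (ma - mb) ** 2)) / (2.0 * M)

def delta_integral(M, ma, mb):
    """(k^2 / (w w')) / (k/w + k/w'), the value of the radial integral."""
    k = solve_k(M, ma, mb)
    w, wp = math.hypot(k, ma), math.hypot(k, mb)
    return (k * k / (w * wp)) / (k / w + k / wp)

if __name__ == "__main__":
    M, ma, mb = 5.0, 1.0, 2.0
    print(solve_k(M, ma, mb), closed_form_k(M, ma, mb), delta_integral(M, ma, mb))
```

The Jacobian factor $1/(k/\omega + k/\omega')$ is the derivative of the delta function argument with respect to $k$, and on the energy shell $\omega + \omega' = M$, which is where the $k/M$ comes from.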


Part III


III.1.2 The amplitude should become nonanalytic when both denominators of the integrand

$$(k^2 - m^2 + i\varepsilon)\,((K - k)^2 - m^2 + i\varepsilon)$$

vanish, namely when $k^2 = m^2$ and $(K - k)^2 = m^2$. But we found the condition in exercise I.7.4, namely that $K^2 > 4m^2$. Referring to (III.1.14)

$$\mathcal{M} = \frac{i\lambda^2}{32\pi^2} \int_0^1 d\alpha\,\log\frac{\Lambda^2}{\alpha(1 - \alpha)K^2 - m^2 + i\varepsilon}$$

we see that the log has a cut starting at $K^2 = m^2/\alpha(1 - \alpha)$. As $\alpha$ ranges from 0 to 1, the minimum value of $m^2/\alpha(1 - \alpha)$ is attained at $\alpha = \frac12$. So indeed, the cut starts at $K^2 = 4m^2$.
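The location of the threshold can be confirmed with a short scan over the Feynman parameter. This is only an illustrative sketch; the function names and the grid size are arbitrary choices.

```python
def branch_point(alpha, m):
    """K^2 at which the argument of the log changes sign for a given alpha."""
    return m * m / (alpha * (1.0 - alpha))

def cut_threshold(m, n=1000):
    """Scan alpha = 1/n, ..., (n-1)/n; return (smallest K^2, alpha where it occurs)."""
    best = min(range(1, n), key=lambda i: branch_point(i / n, m))
    return branch_point(best / n, m), best / n

if __name__ == "__main__":
    print(cut_threshold(1.5))
```

For `m = 1.5` the scan lands on $\alpha = 0.5$ with $K^2 = 9 = 4m^2$, in line with the argument above.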
III.1.3 Under the indicated change, $\log\Lambda \to \log e^{\varepsilon}\Lambda = \log\Lambda + \varepsilon$, and so $\delta\mathcal{M} = -i\,\delta\lambda + iC\lambda^2\,3(2\varepsilon) + O(\lambda^3)$. Thus, $\delta\mathcal{M} = 0$ implies $\delta\lambda = 6C\lambda^2\varepsilon + O(\lambda^3) = 6C\lambda^2\,\delta\log\Lambda + O(\lambda^3)$, giving the stated result for $\Lambda(d\lambda/d\Lambda)$.

III.2.1 For $\int d^dx\,(\partial\varphi)^2$ to be dimensionless, we need $[\varphi] = (d - 2)/2$. Thus $[\varphi^n] = n(d - 2)/2$ and so in order for $\int d^dx\,\lambda_n\varphi^n$ to be dimensionless, we must have $[\lambda_n] = n(2 - d)/2 + d$.

III.3.3 When we set $m = 0$, the integrand is manifestly a linear combination of $\gamma$ matrices. The integral cannot produce a term independent of the $\gamma$ matrices, which is what $B$ is. For electrodynamics, the integral is changed to

$$(ie)^2\,i^2 \int \frac{d^4k}{(2\pi)^4}\,\frac{1}{k^2}\left[\frac{k_\mu k_\nu}{k^2}(1 - \xi) - g_{\mu\nu}\right]\gamma^\nu\,\frac{\not p + \not k + m}{(p + k)^2 - m^2}\,\gamma^\mu = A(p^2)\,\not p + B(p^2)$$

When $m = 0$, the integrand is a linear combination of the product of three $\gamma$ matrices, which can only reduce to one $\gamma$ matrix, not to none. Incidentally, an alternative way of seeing the results stated here is to recall from chapter II.1 that with $m = 0$ the Lagrangian is invariant under the chiral transformation $\psi \to e^{i\theta\gamma^5}\psi$.

III.3.4 This essentially follows from $D = 4 - B_E - \frac32 F_E$ for $B_E = 0$ and $F_E = 2$. Then $D = 1$ but the linear divergence is reduced to logarithmic divergence by the symmetry argument given in the text.

III.5.2 Basically, you have already done this problem in exercises II.1.3 and II.1.4. You merely have to replace $E$ and $\vec p$ by $\partial/\partial t$ and $\vec\nabla$ (see also chapter III.6).

III.5.3 In nonrelativistic quantum mechanics, the scattering amplitude is given in the Born approximation by $i$ times the Fourier transform of the potential: $i\int d^3x\,e^{i\vec k\cdot\vec x}\,U(\vec x)$. The scattering amplitude owing to the exchange of a scalar meson of mass $m$ is just $i/(k^2 - m^2) \simeq -i/(\vec k^2 + m^2)$. Thus, we just repeat the calculation in chapter I.4, obtaining

$$U(\vec x) = -\int \frac{d^3k}{(2\pi)^3}\,\frac{e^{i\vec k\cdot\vec x}}{\vec k^2 + m^2} = -\frac{1}{4\pi r}\,e^{-mr}$$

III.6.1 $\bar u(p')(\not p'\gamma^\mu + \gamma^\mu\not p)u(p) = 2m\,\bar u(p')\gamma^\mu u(p)$ by the equation of motion, but using $\gamma^\mu\gamma^\nu = \frac12\{\gamma^\mu, \gamma^\nu\} + \frac12[\gamma^\mu, \gamma^\nu] = \eta^{\mu\nu} - i\sigma^{\mu\nu}$ we can also write $(\not p'\gamma^\mu + \gamma^\mu\not p) = (p' + p)^\mu + i\sigma^{\mu\nu}(p' - p)_\nu$. We thus obtain the Gordon decomposition.

III.6.2 We compute

$$q_\mu\,\bar u(p')\left[\gamma^\mu F_1(q^2) + \frac{i\sigma^{\mu\nu}q_\nu}{2m}F_2(q^2)\right]u(p) = \bar u(p')\,\not q\,u(p)\,F_1(q^2)$$

$$= \bar u(p')(\not p' - \not p)u(p)\,F_1(q^2)$$

$$= \bar u(p')(m - m)u(p)\,F_1(q^2) = 0$$

where the first and third equality follows from the antisymmetry of $\sigma^{\mu\nu}$ and the equation of motion, respectively.

III.7.1 Proceeding as in the text but living in $d$-dimensional spacetime, we obtain $i\Pi_{\mu\nu}(q) = -\int \frac{d^dp}{(2\pi)^d}\frac{N_{\mu\nu}}{D}$ where $\frac{1}{D} = \int_0^1 d\alpha\,\frac{1}{\mathcal{D}}$ with $\mathcal{D} = (l^2 + \alpha(1 - \alpha)q^2 - m^2 + i\varepsilon)^2$ as before but with $N_{\mu\nu}$ now effectively equal to $-d\,\{(1 - \frac{2}{d})\,g_{\mu\nu}l^2 + \alpha(1 - \alpha)(2q_\mu q_\nu - g_{\mu\nu}q^2) - m^2 g_{\mu\nu}\}$. Rotating to Euclidean space we see that we have to do the integrals (with $c^2 = m^2 - \alpha(1 - \alpha)q^2$)

$$\int \frac{d^dl}{(l^2 + c^2)^2} \quad\text{and}\quad \int \frac{d^dl\;l^2}{(l^2 + c^2)^2} = \int \frac{d^dl}{l^2 + c^2} - c^2\int \frac{d^dl}{(l^2 + c^2)^2}$$

I did the first of these integrals for you in appendix II in chapter III.1. Generalizing slightly we have

$$\int_0^\infty dl\,\frac{l^{d-1}}{(l^2 + c^2)^a} = \frac12\,c^{\,d - 2a}\int_0^1 dx\,(1 - x)^{\frac{d}{2} - 1}\,x^{a - 1 - \frac{d}{2}}$$

I will let you carry on from here.
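The generalized radial integral quoted for exercise III.7.1 can be checked numerically. The $x$ integral is the Euler beta function $B(d/2, a - d/2) = \Gamma(d/2)\Gamma(a - d/2)/\Gamma(a)$, so the claim is $\int_0^\infty dl\,l^{d-1}/(l^2 + c^2)^a = \frac12 c^{d-2a}\,\Gamma(d/2)\Gamma(a - d/2)/\Gamma(a)$ for $a > d/2$. The sketch below uses a midpoint rule after mapping the half line to the unit interval; the function names and sample values are illustrative.

```python
import math

def radial_integral_numeric(d, a, c, n=50_000):
    """Midpoint rule for the integral over l of l^(d-1) / (l^2 + c^2)^a.

    The substitution l = c t / (1 - t) maps (0, infinity) to (0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        l = c * t / (1.0 - t)
        jacobian = c / (1.0 - t) ** 2
        total += l ** (d - 1) / (l * l + c * c) ** a * jacobian
    return total * h

def radial_integral_closed(d, a, c):
    """(1/2) c^(d-2a) B(d/2, a - d/2), valid for a > d/2 > 0."""
    beta = math.gamma(d / 2) * math.gamma(a - d / 2) / math.gamma(a)
    return 0.5 * c ** (d - 2 * a) * beta

if __name__ == "__main__":
    for d, a, c in [(3, 2, 1.3), (2.5, 2, 0.8)]:
        print(d, a, c, radial_integral_numeric(d, a, c), radial_integral_closed(d, a, c))
```

Because the closed form is analytic in $d$, it can be continued away from integer dimension, which is the point of working in $d$ dimensions.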
Part IV


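IV.1.1 Write $\vec\varphi = (\varphi_1, \varphi_2, \ldots, v + \varphi'_N)$. We compute $\frac12\mu^2\vec\varphi^{\,2} - (\lambda/4)(\vec\varphi^{\,2})^2$ up to $O(\varphi^3)$ and find (upon dropping the $'$)

$$\frac12\mu^2\,(v^2 + 2v\varphi_N + \vec\varphi^{\,2}) - \frac{\lambda}{4}\,(v^4 + 4v^2\varphi_N^2 + 4v^3\varphi_N + 2v^2\vec\varphi^{\,2})$$

The condition of no linear term in $\varphi_N$ fixes $v^2 = \mu^2/\lambda$ and so the coefficient of $\vec\varphi^{\,2}$ is equal to $\frac12\mu^2 - (\lambda/4)(2v^2) = 0$. The $(N - 1)$ fields $\varphi_1, \varphi_2, \ldots, \varphi_{N-1}$ are massless.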
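The counting of massless modes can be cross-checked numerically from the potential energy $V(\vec\varphi) = -\frac12\mu^2\vec\varphi^{\,2} + \frac{\lambda}{4}(\vec\varphi^{\,2})^2$, which is minus the non-derivative part of the Lagrangian used above. At the vacuum $(0, \ldots, 0, v)$ the matrix of second derivatives is diagonal, so its diagonal entries are the squared masses. This is a sketch with illustrative names and sample couplings.

```python
import math

def potential(phi, mu2, lam):
    """V = -(1/2) mu^2 phi.phi + (lam/4) (phi.phi)^2."""
    r2 = sum(x * x for x in phi)
    return -0.5 * mu2 * r2 + 0.25 * lam * r2 * r2

def second_derivatives(phi, mu2, lam, h=1e-4):
    """Central second differences d^2 V / d phi_i^2 at the point phi."""
    base = potential(phi, mu2, lam)
    out = []
    for i in range(len(phi)):
        up, dn = list(phi), list(phi)
        up[i] += h
        dn[i] -= h
        out.append((potential(up, mu2, lam) - 2.0 * base + potential(dn, mu2, lam)) / (h * h))
    return out

def masses_squared(n, mu2, lam):
    """Squared masses around the vacuum (0, ..., 0, v) with v^2 = mu^2 / lam."""
    v = math.sqrt(mu2 / lam)
    vacuum = [0.0] * (n - 1) + [v]
    return second_derivatives(vacuum, mu2, lam)

if __name__ == "__main__":
    print(masses_squared(3, mu2=2.0, lam=0.5))
```

The first $N - 1$ entries come out zero within numerical error, while the radial direction carries $2\mu^2$.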


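IV.3.1 We have $\int_0^\infty dk\,\log[(k^2 + a^2)/k^2] = \pi a$ and so $V_{\rm eff}(\varphi) = V(\varphi) + \hbar\sqrt{V''(\varphi)}/2 + O(\hbar^2)$. For $\mathcal{L} = \frac12(\partial\varphi)^2 - \frac12\omega^2\varphi^2$, the quantum oscillator with $\varphi$ identified as position, we have $V_{\rm eff}(0) = \frac12\hbar\omega$.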
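The integral $\int_0^\infty dk\,\log[(k^2 + a^2)/k^2] = \pi a$ is simple to verify numerically. The sketch below maps the half line to the unit interval and applies a midpoint rule, which tolerates the integrable logarithmic singularity at $k = 0$; the function name and step count are illustrative.

```python
import math

def log_integral(a, n=200_000):
    """Midpoint estimate of the integral over k of log[(k^2 + a^2) / k^2].

    The substitution k = a t / (1 - t) maps (0, infinity) to (0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        k = a * t / (1.0 - t)
        jacobian = a / (1.0 - t) ** 2
        total += math.log1p((a / k) ** 2) * jacobian
    return total * h

if __name__ == "__main__":
    a = 2.0
    print(log_integral(a), math.pi * a)
```

With $a = \sqrt{V''(\varphi)}$ this is the integral that produces the $\hbar\sqrt{V''}/2$ correction, which for the oscillator is the familiar zero point energy.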

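IV.3.3 We have $m(\varphi) = f\varphi$ in

$$V_F(\varphi) = 2i\int \frac{d^2p}{(2\pi)^2}\,\log\frac{p^2 - m(\varphi)^2}{p^2}$$

which after Wick rotation becomes

$$-2\int \frac{d^2p_E}{(2\pi)^2}\,\log\frac{p_E^2 + m(\varphi)^2}{p_E^2} = -\frac{1}{2\pi}\int_0^\infty dx\,\log\frac{x + m(\varphi)^2}{x}$$

After cutting off the integral at $\Lambda^2$ and adding a counterterm $B\varphi^2$ we obtain

$$V_F = \frac{1}{2\pi}\,(f\varphi)^2\,\log\frac{\varphi^2}{M^2}$$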


IV.3.4 $V_{\rm eff} = i\int \frac{d^4k}{(2\pi)^4}\sum_{n=1}^{\infty}\frac{1}{2n}\left[\frac{V''(\varphi)}{k^2}\right]^n$. For $V''(\varphi) = \frac12\lambda\varphi^2$ the corresponding Feynman diagrams consist of a circle with $n$ $V''$’s attached to the circumference, where the $2n$ is the infamous symmetry factor that I tried to avoid talking about in chapter I.7.


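IV.4.1 With $H$ an arbitrary $p$-form,

$$ddH = \frac{1}{(p + 1)!}\,\frac{1}{p!}\,\partial_\lambda\partial_\nu H_{\mu_1\mu_2\cdots\mu_p}\,dx^\lambda\,dx^\nu\,dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}$$

$$= \frac12\,\frac{1}{(p + 1)!}\,\frac{1}{p!}\,[\partial_\lambda, \partial_\nu]\,H_{\mu_1\mu_2\cdots\mu_p}\,dx^\lambda\,dx^\nu\,dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p} = 0$$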


IV.5.1 If you have done all the exercises thus far (see exercise I.10.3), you have already made the acquaintance of the $I = 2$ scalar field transforming as $\varphi^{ab} \to R^{ac}R^{bd}\varphi^{cd} = R^{ac}\varphi^{cd}(R^T)^{db} = (R\varphi R^T)^{ab}$, which thus can be written as a traceless 3 by 3 symmetric matrix $\varphi \to R\varphi R^T$. Now you merely have to write out the covariant derivative $D_\mu\varphi$ (see IV.5.20) explicitly. [Hint: The action of the generators on $\varphi$ is similar to what is shown in (B.20).]


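IV.5.2 $dF = d(dA + A^2) = dA\,A - A\,dA$ and $[A, F] = A\,dA - dA\,A$ and so $dF + [A, F] = 0$. Explicitly with indices, this reads $\varepsilon^{\mu\nu\lambda\sigma}(\partial_\nu F_{\lambda\sigma} + [A_\nu, F_{\lambda\sigma}]) = 0$. In the abelian case, we have, for $\mu = 0$, $\varepsilon^{ijk}\partial_i F_{jk} \sim \vec\nabla\cdot\vec B = 0$ (recall chapter IV.4!), and for $\mu = i$, $\varepsilon^{ijk}(-\partial_0 F_{jk} + \partial_j F_{0k} - \partial_j F_{k0}) \sim -\partial_0 B_i + (\vec\nabla\times\vec E)_i = 0$.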


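IV.5.4 From the general arguments mentioned in the problem we know that $\mathrm{tr}\,F^2$ must be the “d of something.” Now $d\,\mathrm{tr}\,A\,dA = \mathrm{tr}\,dA\,dA$ and $d\,\mathrm{tr}\,\frac23 A^3 = \frac23\,\mathrm{tr}(dA\,A^2 - A\,dA\,A + A^2\,dA) = 2\,\mathrm{tr}\,dA\,A^2$ but on the other hand $\mathrm{tr}\,F^2 = \mathrm{tr}\,(dA + A^2)(dA + A^2) = \mathrm{tr}\,(dA\,dA + 2\,dA\,A^2)$ since $\mathrm{tr}\,A^4 = \mathrm{tr}\,A^3A = -\mathrm{tr}\,AA^3 = -\mathrm{tr}\,A^4 = 0$. In electromagnetism, $\mathrm{tr}\,A^3 = 0$, and $d\,\mathrm{tr}\,A\,dA$ when written out in elementary notation is just $\partial_\mu(\varepsilon^{\mu\nu\lambda\sigma}A_\nu\partial_\lambda A_\sigma) = \frac14\,\varepsilon^{\mu\nu\lambda\sigma}F_{\mu\nu}F_{\lambda\sigma}$.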
IV.5.6 We simply plug in the general expression in the text and obtain

$$\mathcal{L} = \bar q\,(i\gamma^\mu D_\mu - m)\,q$$

with the covariant derivative $D_\mu = \partial_\mu - iA_\mu = \partial_\mu - iA^a_\mu T^a$, where $T^a$ ($a = 1, \ldots, 8$) are traceless hermitean 3 by 3 matrices. Explicitly, $(A_\mu q)^\alpha = A^a_\mu\,(T^a)^\alpha{}_\beta\,q^\beta$, with $\alpha, \beta = 1, 2, 3$ (see chapter VII.3).


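IV.6.3 Observe

$$A^a_\mu T^a = \frac12\begin{pmatrix} A^3_\mu & A^{1-i2}_\mu \\ A^{1+i2}_\mu & -A^3_\mu \end{pmatrix}$$

with the obvious notation $A^{1\pm i2}_\mu = A^1_\mu \pm iA^2_\mu$. Let $\langle\varphi\rangle = \begin{pmatrix} 0 \\ v \end{pmatrix}$ so that

$$D_\mu\varphi \;\propto\; v\begin{pmatrix} gA^{1-i2}_\mu \\ -gA^3_\mu + g'B_\mu \end{pmatrix}$$

Thus, $|D_\mu\varphi|^2$ contains a term proportional to $v^2\,[g^2 A^{1+i2}_\mu A^{1-i2}_\mu + (-gA^3_\mu + g'B_\mu)^2]$. The combinations $A^{1+i2}_\mu$, $A^{1-i2}_\mu$, and $(-gA^3_\mu + g'B_\mu)$ acquire mass while $(g'A^3_\mu + gB_\mu)$ remains massless.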


IV.7.4 We have

$$\Delta^{\mu\nu}(k_1, k_2) = i\int \frac{d^4p}{(2\pi)^4}\,\frac{N^{\mu\nu}}{D} + \{\mu, k_1 \leftrightarrow \nu, k_2\}$$

where

$$N^{\mu\nu} = \mathrm{tr}\,\gamma^5(\not p - \not q + M)\,\gamma^\nu\,(\not p - \not k_1 + M)\,\gamma^\mu\,(\not p + M)$$

Only the term linear in $M$ in $N^{\mu\nu}$ does not vanish, giving $N^{\mu\nu} = 4iM\varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}$. Since we are interested only in terms of $O(k_1k_2)$ we can set $D \to (p^2 - M^2)^3$ so that

$$\Delta^{\mu\nu}(k_1, k_2) = -8M\varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}\int \frac{d^4p}{(2\pi)^4}\,\frac{1}{(p^2 - M^2)^3} = \frac{i}{4\pi^2M}\,\varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}$$

with a dependence on $M$ as stated in the problem. The effect of the regulator, like some unsavory acquaintance, remains even after we have sent him to infinity.


IV.7.5 We will sketch the solution. The details may be found in the lectures given by S. Adler at the 1970 Brandeis Summer School. The point is to imagine a regularization scheme that preserves the various relevant symmetries, namely Lorentz invariance, vector current conservation, and Bose statistics. As you will see, we don’t actually have to specify the regularization. By Lorentz invariance, we have

$$\Delta^{\lambda\mu\nu}(k_1, k_2) = \varepsilon^{\lambda\mu\nu\sigma}k_{1\sigma}A_1 + \varepsilon^{\lambda\mu\nu\sigma}k_{2\sigma}A_2 + \varepsilon^{\lambda\mu\sigma\tau}k_{1\sigma}k_{2\tau}k_1^\nu A_3$$

$$+\; \varepsilon^{\lambda\mu\sigma\tau}k_{1\sigma}k_{2\tau}k_2^\nu A_4 + \varepsilon^{\lambda\nu\sigma\tau}k_{1\sigma}k_{2\tau}k_1^\mu A_5$$

$$+\; \varepsilon^{\lambda\nu\sigma\tau}k_{1\sigma}k_{2\tau}k_2^\mu A_6 + \varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}k_1^\lambda A_7$$

$$+\; \varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}k_2^\lambda A_8$$

Since the Feynman integral representing $\Delta^{\lambda\mu\nu}$ is superficially linearly divergent, we see that $A_3, \ldots, A_8$ are all convergent since we have to pull out three powers of momentum to extract them. In contrast, $A_1$ and $A_2$ are logarithmically divergent. But we can relate them to $A_3, \ldots, A_8$ by vector current conservation since $0 = k_{1\mu}\Delta^{\lambda\mu\nu} = \varepsilon^{\lambda\nu\sigma\tau}k_{1\sigma}k_{2\tau}(-A_2 + k_1^2A_5 + k_1\cdot k_2\,A_6)$ and thus $A_2 = k_1^2A_5 + k_1\cdot k_2\,A_6$. Similarly for $A_1$. Rationalizing the Feynman integrand and evaluating the trace in the numerator, we can systematically ignore terms that contribute only to $A_1$ and $A_2$. Furthermore, Bose statistics gives us relations such as $A_3(k_1^2, k_2^2, q^2) = -A_6(k_2^2, k_1^2, q^2)$.
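The contraction with $k_{1\mu}$ can be verified by brute force. The sketch below assumes the index assignment in which $A_3, A_4$ multiply $\varepsilon^{\lambda\mu\sigma\tau}k_{1\sigma}k_{2\tau}k^\nu$, $A_5, A_6$ multiply $\varepsilon^{\lambda\nu\sigma\tau}k_{1\sigma}k_{2\tau}k^\mu$, and $A_7, A_8$ multiply $\varepsilon^{\mu\nu\sigma\tau}k_{1\sigma}k_{2\tau}k^\lambda$; this assignment is a reconstruction chosen to reproduce the relation $A_2 = k_1^2A_5 + k_1\cdot k_2A_6$ quoted in the solution. Random numbers stand in for the momenta and form factors, and all names are illustrative.

```python
import random

ETA = (1.0, -1.0, -1.0, -1.0)  # metric signature (+, -, -, -)
R4 = range(4)

def levi_civita(idx):
    """Sign of the permutation idx of (0, 1, 2, 3), or 0 if an index repeats."""
    if len(set(idx)) < 4:
        return 0
    sign = 1
    for i in range(4):
        for j in range(i + 1, 4):
            if idx[i] > idx[j]:
                sign = -sign
    return sign

def contraction_gap(seed=1):
    """Largest violation of k1_mu Delta^{lam mu nu} = E^{lam nu}(-A2 + k1.k1 A5 + k1.k2 A6)."""
    rng = random.Random(seed)
    k1 = [rng.uniform(-1, 1) for _ in R4]  # lower-index components
    k2 = [rng.uniform(-1, 1) for _ in R4]
    A = [rng.uniform(-1, 1) for _ in range(8)]  # A[0] is A_1, ..., A[7] is A_8
    k1_up = [ETA[m] * k1[m] for m in R4]
    k2_up = [ETA[m] * k2[m] for m in R4]

    def eps1(a, b, c, k):
        return sum(levi_civita((a, b, c, s)) * k[s] for s in R4)

    def eps2(a, b):
        return sum(levi_civita((a, b, s, t)) * k1[s] * k2[t] for s in R4 for t in R4)

    def delta(lam, mu, nu):
        return (eps1(lam, mu, nu, k1) * A[0] + eps1(lam, mu, nu, k2) * A[1]
                + eps2(lam, mu) * (k1_up[nu] * A[2] + k2_up[nu] * A[3])
                + eps2(lam, nu) * (k1_up[mu] * A[4] + k2_up[mu] * A[5])
                + eps2(mu, nu) * (k1_up[lam] * A[6] + k2_up[lam] * A[7]))

    k1k1 = sum(k1[m] * k1_up[m] for m in R4)
    k1k2 = sum(k1[m] * k2_up[m] for m in R4)
    worst = 0.0
    for lam in R4:
        for nu in R4:
            lhs = sum(k1[mu] * delta(lam, mu, nu) for mu in R4)
            rhs = eps2(lam, nu) * (-A[1] + k1k1 * A[4] + k1k2 * A[5])
            worst = max(worst, abs(lhs - rhs))
    return worst

if __name__ == "__main__":
    print(contraction_gap())
```

The $A_1$, $A_3$, $A_4$, $A_7$, and $A_8$ terms drop out of the contraction by antisymmetry, leaving only the three terms in the bracket.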
Part V

V.1.1 We dropped the term $h^2\partial_0\theta$ but kept the term $4g^2\bar\rho h^2$. This requires $\partial_0\theta \ll g^2\bar\rho$, that is, $\omega \ll g^2\bar\rho$, but since in our solution $\omega \sim g\sqrt{\bar\rho/m}\,k$ this requires $k \ll g\sqrt{m\bar\rho}$, which is consistent with what we assumed about $k$. Looking at the terms $-2\sqrt{\bar\rho}\,h\,\partial_0\theta - 4g^2\bar\rho h^2$ in $\mathcal{L}$ we see that $h \sim \partial_0\theta/(g^2\sqrt{\bar\rho}) \ll \sqrt{\bar\rho}$, which is also consistent.

V.5.1 With $\gamma^5 = \sigma_3$, $\frac12(I \pm \gamma^5)$ clearly projects out the top and bottom component of $\psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix}$, respectively. Everything is formally the same as in chapter II.1, but we can also work things out explicitly in the specific representation given here. Thus, $\bar\psi\psi = \psi_L^\dagger\psi_R + \psi_R^\dagger\psi_L$ and $\bar\psi\gamma^5\psi = -\psi_L^\dagger\psi_R + \psi_R^\dagger\psi_L$. Under the transformation $\psi \to e^{i\theta\gamma^5}\psi$, $\psi_L \to e^{i\theta}\psi_L$ and $\psi_R \to e^{-i\theta}\psi_R$, and the massless Dirac Lagrangian

$$\mathcal{L} = i\psi_R^\dagger\left(\frac{\partial}{\partial t} + v_F\frac{\partial}{\partial x}\right)\psi_R + i\psi_L^\dagger\left(\frac{\partial}{\partial t} - v_F\frac{\partial}{\partial x}\right)\psi_L$$

clearly does not change.

V.6.1 This of course just follows from Lorentz invariance. We have

$$\frac{\partial}{\partial t}\,\varphi\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right) = \frac{-v}{\sqrt{1 - v^2}}\;\varphi'\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right)$$

and

$$\frac{\partial}{\partial x}\,\varphi\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right) = \frac{1}{\sqrt{1 - v^2}}\;\varphi'\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right)$$

and thus the equation

$$\partial_t^2\varphi - \partial_x^2\varphi + V'(\varphi) = 0$$

becomes

$$\varphi''\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right) - V'\!\left[\varphi\!\left(\frac{x - vt}{\sqrt{1 - v^2}}\right)\right] = 0$$

Note that this does not depend on the form of $V$. For any relativistic theory, the soliton moves like a relativistic particle (obviously!).

V.6.2 The sine-Gordon theory has an infinite number of vacua occurring at $\varphi = (2n + 1)\pi/\beta$. Thus, there exists a whole spectrum of solitons, such that $\varphi(\pm\infty) = (2n_\pm + 1)\pi/\beta$. The topological current is $J^\mu = (\beta/2\pi)\,\varepsilon^{\mu\nu}\partial_\nu\varphi$ with the corresponding charge $Q = (n_+ - n_-)$. The $Q = 2$ soliton decays into two $Q = 1$ solitons.

V.7.4 $(i/2\pi)\oint_{S^1} g\,dg^\dagger = (i/2\pi)\oint_{S^1} e^{i\nu\theta}\,de^{-i\nu\theta} = (i/2\pi)\oint_{S^1}(-i\nu\,d\theta) = (\nu/2\pi)\oint d\theta = \nu$, which indeed counts the number of times $g$ winds around the circle. What mathematicians call the winding number is indeed just the magnetic flux of the physicist.

V.7.5 Within a region small enough so that we can treat $\varphi^a = v\delta^{a3}$ as constant, using $(D_\mu\varphi)^b = \partial_\mu\varphi^b + e\varepsilon^{bcd}A^c_\mu\varphi^d$ we have $(D_\mu\varphi)^1 = evA^2_\mu$ and $(D_\mu\varphi)^2 = -evA^1_\mu$ and thus

$$\mathcal{F}_{\mu\nu} \equiv \frac{F^a_{\mu\nu}\varphi^a}{|\varphi|} - \frac{(1/e)\,\varepsilon^{abc}\varphi^a(D_\mu\varphi)^b(D_\nu\varphi)^c}{|\varphi|^3}$$

$$= \partial_\mu A^3_\nu - \partial_\nu A^3_\mu + e(A^1_\mu A^2_\nu - A^2_\mu A^1_\nu) - e(A^1_\mu A^2_\nu - A^2_\mu A^1_\nu) = \partial_\mu A_\nu - \partial_\nu A_\mu$$

precisely the electromagnetic field strength since $A_\mu \equiv A^3_\mu$ is the massless component of the Yang-Mills field. Let us compute $B_k = \frac12\varepsilon_{ijk}\mathcal{F}_{ij}$ far from the magnetic monopole. To calculate the magnetic charge we are interested only in the term of order $1/r^2$ in $\vec B$. Since $D_i\varphi \to O(1/r^2)$ by construction we can drop the second term in $\mathcal{F}_{ij}$. Thus, we merely have to compute $F^a_{ij} = \partial_iA^a_j - \partial_jA^a_i + e\varepsilon^{abc}A^b_iA^c_j$. Since $F^a_{ij}$ will eventually be contracted with the unit vector $\varphi^a/|\varphi| = x^a/r$, we can effectively drop some of the terms in $F^a_{ij}$, thus simplifying the computation. We have

$$\partial_iA^a_j = \partial_i\!\left(\varepsilon^{ajk}\frac{x^k}{er^2}\right) \to -\varepsilon^{aij}\,\frac{1}{er^2}$$

and

$$e\varepsilon^{abc}A^b_iA^c_j = \frac{(1/e)\,\varepsilon^{abc}\varepsilon^{bim}\varepsilon^{cjn}x^mx^n}{r^4} = \frac{(1/e)(\delta^{ci}\delta^{am} - \delta^{cm}\delta^{ai})\,\varepsilon^{cjn}x^mx^n}{r^4}$$

so that

$$\frac{F^a_{ij}x^a}{r} = \frac{1}{er^3}(-2 + 1)\,\varepsilon^{ijk}x^k = -\frac{1}{er^3}\,\varepsilon^{ijk}x^k$$

and hence $B_k = -(1/er^2)\,(x^k/r)$. The magnetic charge $g = -4\pi/e$.

Our result appears to differ from Dirac’s quantization condition (IV.4.10) by a factor of 2. The resolution of this apparent paradox is instructive. In fact, we can always introduce into this theory a field $\Psi^i$ (which could be a Bose or a Fermi field) transforming in the $I = \frac12$ representation with the corresponding covariant derivative $D_\mu\Psi^i = \partial_\mu\Psi^i - ie(\frac12\tau^a)^{ij}A^a_\mu\Psi^j$. The field $\Psi$ carries electric charge $\frac12e$. Thus, the fundamental unit of electric charge is actually $\frac12e$, not $e$, and our result $g = -4\pi/e = -2\pi/(e/2)$ is actually nothing but the Dirac quantization condition. (The sign is trivial: just a question of which one we call the monopole and which the antimonopole.)


V.7.7 Plugging in the Ansatz $\varphi^a = (H(r)/er)(x^a/r)$ and $A^b_i = [1 - K(r)]\,\varepsilon^{bij}(x^j/er^2)$ [so that $H(r) \to evr$ and $K(r) \to 0$ as $r \to \infty$ in accordance with the asymptotic behavior (V.7.5) and (V.7.6)] into $M = \int d^3x\,\{\frac14(F_{ij})^2 + \frac12(D_i\varphi)^2 + V(\varphi)\}$ we get $M$ as a functional of $H$ and $K$. Minimizing $M$ gives (with $H' = dH/dr$ etc.) the equations $r^2H'' = 2HK^2 + (\lambda/e^2)[H^3 - (ev)^2r^2H]$ and $r^2K'' = K(K^2 - 1) + KH^2$. For help, see M. K. Prasad and C. M. Sommerfield, Phys. Rev. Lett. 35: 760, 1975.

V.7.8 The BPS solution corresponds to setting $\lambda = 0$ in the two equations in exercise V.7.7, rendering the equations soluble, with the solution $H(r) = evr\coth(evr) - 1$ and $K(r) = evr/\sinh(evr)$. Ask yourself why $H(r)$ and $K(r)$ approach their asymptotic values exponentially. What determines the length scale?
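A quick numerical check confirms that the BPS profiles solve the $\lambda = 0$ equations $r^2H'' = 2HK^2$ and $r^2K'' = K(K^2 - 1) + KH^2$. Both equations are unchanged when $r$ is replaced by the dimensionless variable $x = evr$, so the sketch below works in $x$ and estimates the second derivatives by central differences; the function names and sample points are illustrative.

```python
import math

def H(x):
    """BPS profile H as a function of x = e v r."""
    return x / math.tanh(x) - 1.0

def K(x):
    """BPS profile K as a function of x = e v r."""
    return x / math.sinh(x)

def second_derivative(f, x, h=1e-3):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def residuals(x):
    """Left side minus right side of the two lambda = 0 equations at x."""
    res_H = x * x * second_derivative(H, x) - 2.0 * H(x) * K(x) ** 2
    res_K = x * x * second_derivative(K, x) - (K(x) * (K(x) ** 2 - 1.0) + K(x) * H(x) ** 2)
    return res_H, res_K

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 4.0):
        print(x, residuals(x))
```

The residuals are at the level of the finite-difference error. The hyperbolic functions also make the exponential approach to $H \to evr - 1$ and $K \to 0$ explicit, with decay length set by $1/ev$.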

V.7.9 For help, see B. Julia and A. Zee, Phys. Rev. Dll: 2227, 1975. 

V.7.11 We derived the lower bound for the mass of the magnetic monopole $4\pi v|g| \sim 4\pi(ev)/e^2 \sim M_W/\alpha$.

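V.7.12 Near the identity element $g = e^{i\vec\theta\cdot\vec\sigma} \simeq 1 + i\vec\theta\cdot\vec\sigma$ and thus $g\,dg^\dagger = -i\,d\vec\theta\cdot\vec\sigma$. In a small neighborhood of the identity element the group manifold is locally Euclidean and so

$$\mathrm{tr}\,(g\,dg^\dagger)^3 = i\,\mathrm{tr}\,(\sigma^i\sigma^j\sigma^k)\,d\theta^i\,d\theta^j\,d\theta^k = -12\,d\theta^1\,d\theta^2\,d\theta^3$$

is manifestly proportional to the volume element on $S^3$. For $g = e^{i(\theta_1\sigma_1 + \theta_2\sigma_2 + m\theta_3\sigma_3)}$, $\mathrm{tr}\,(g\,dg^\dagger)^3 = -12m\,d\theta^1\,d\theta^2\,d\theta^3$.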
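The coefficient $-12$ can be checked directly. Since the $d\theta^i$ anticommute, $d\theta^i d\theta^j d\theta^k = \varepsilon^{ijk}\,d\theta^1 d\theta^2 d\theta^3$, so the coefficient is $i\,\varepsilon^{ijk}\,\mathrm{tr}(\sigma^i\sigma^j\sigma^k)$ summed over the six permutations. The sketch below evaluates this with explicit Pauli matrices; the helper names are illustrative.

```python
import itertools

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul2(a, b):
    """Product of two 2 by 2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace3(i, j, k):
    """tr(sigma_i sigma_j sigma_k) with zero-based indices."""
    m = matmul2(matmul2(SIGMA[i], SIGMA[j]), SIGMA[k])
    return m[0][0] + m[1][1]

def perm_sign(p):
    sign = 1
    for a in range(len(p)):
        for b in range(a + 1, len(p)):
            if p[a] > p[b]:
                sign = -sign
    return sign

def volume_coefficient():
    """Coefficient of d(theta1) d(theta2) d(theta3) in tr (g dg^dagger)^3."""
    total = sum(perm_sign(p) * trace3(*p) for p in itertools.permutations(range(3)))
    return 1j * total

if __name__ == "__main__":
    print(volume_coefficient())
```

Each even permutation contributes $\mathrm{tr}(\sigma^1\sigma^2\sigma^3) = 2i$ and each odd one $-2i$, so the sum is $12i$ and the overall factor of $i$ turns it into $-12$.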

V.7.13 $\int d^4x\,(\partial_\mu J^\mu_5) = \int d^3x\,J^0_5\big|_{t=+\infty} - \int d^3x\,J^0_5\big|_{t=-\infty}$. Recalling that $J^0_5 = \psi_R^\dagger\psi_R - \psi_L^\dagger\psi_L$ we see that the two spatial integrals just count the number of right moving fermion quanta minus the number of left moving fermion quanta at $t = \pm\infty$ respectively. So $\int d^4x\,(\partial_\mu J^\mu_5)$ is an integer. On the other hand, in the text we proved that $\int \mathrm{tr}\,F^2$ is a topological invariant. In other words, with suitable normalization, evidently $1/(4\pi)^2$, the integral $[1/(4\pi)^2]\int d^4x\,\varepsilon^{\mu\nu\lambda\sigma}\,\mathrm{tr}\,F_{\mu\nu}F_{\lambda\sigma}$ is an integer. Thus, the coefficient $1/(4\pi)^2$ cannot be shifted even a little bit by quantum fluctuations.
Part VI


VI.4.2 The quartic interaction term $(1/2f^2)(\vec\pi\cdot\partial\vec\pi)^2$ in $\mathcal{L}$ gives the amplitude $i(1/2f^2)\,i^2\,\delta^{ab}\delta^{cd}(k_1k_3 + k_1k_4 + k_2k_3 + k_2k_4)$ + permutations $= (i/2f^2)\,\delta^{ab}\delta^{cd}(k_1 + k_2)^2$ + permutations for the 4-pion interaction vertex (where for convenience we have labeled all the momenta as going outward so that $k_1 + k_2 + k_3 + k_4 = 0$).


VI.4.3 After writing $\sigma = v + \sigma'$, we find, as in chapter IV.1, that $\mathcal{L} = -\frac12(2\mu^2)\sigma'^2 - \lambda v\sigma'\vec\pi^2 - \frac14\lambda(\vec\pi^2)^2 + \cdots$, where we have displayed only terms relevant for our purposes. The diagrams contributing to four-pion interaction are of two types, those involving the $\lambda(\vec\pi^2)^2$ term and those involving $\sigma'$ exchange. The former gives for the amplitude $(-\frac{i}{4}\lambda)\,2\cdot2\,(\delta^{ab}\delta^{cd} + \delta^{ac}\delta^{bd} + \delta^{ad}\delta^{bc})$ while the latter gives $(-i\lambda v)^2\{2i/[(k_1 + k_2)^2 - m_{\sigma'}^2]\}\,\delta^{ab}\delta^{cd}$. Thus, expanding to first order in momenta squared we find the coefficient of $\delta^{ab}\delta^{cd}$:

$$-i\lambda - \frac{2i\lambda^2v^2}{(k_1 + k_2)^2 - m_{\sigma'}^2} \simeq -i\lambda + \frac{2i\lambda^2v^2}{2\mu^2}\left(1 + \frac{(k_1 + k_2)^2}{2\mu^2}\right) = \frac{i\lambda}{2\mu^2}\,(k_1 + k_2)^2$$

To compare with exercise VI.4.2 we remember that $f^2 = v^2 = \mu^2/\lambda$ so that the amplitude here is also equal to $(i/2f^2)\,\delta^{ab}\delta^{cd}(k_1 + k_2)^2$ + permutations, as we had anticipated in the text.


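VI.4.4 We will track down factors of 2 carefully but not factors of $i$ and $-1$. Let us go back to the chiral transformations $\psi \to [1 + i\vec\theta\cdot(\vec\tau/2)\gamma^5]\psi$ and $\bar\psi \to \bar\psi[1 + i\vec\theta\cdot(\vec\tau/2)\gamma^5]$. Thus, $\delta(\bar\psi\psi) = \theta^a\bar\psi i\gamma^5\tau^a\psi$ and $\delta(\bar\psi i\gamma^5\tau^a\psi) = -\theta^a\bar\psi\psi$. Hence, for $\mathcal{L} = \bar\psi\{i\gamma\partial + g(\sigma + i\vec\tau\cdot\vec\pi\gamma^5)\}\psi + \mathcal{L}(\sigma, \vec\pi)$ to be invariant we must have $\delta\sigma = \theta^a\pi^a$ and $\delta\pi^a = -\theta^a\sigma$. Applying Noether’s theorem $J^\mu = (\delta\mathcal{L}/\delta\partial_\mu\varphi)\,\delta\varphi$, we obtain the current $J^a_{\mu5} = \bar\psi\gamma_\mu\gamma^5(\tau^a/2)\psi + \pi^a\partial_\mu\sigma - \sigma\partial_\mu\pi^a$ written in the text. Comparing the term $\bar p\gamma_\mu\gamma^5n$ contained in $J^{1+i2}_{\mu5} = J^1_{\mu5} + iJ^2_{\mu5}$ with the current $J_{5\mu}$ defined in chapter IV.2, we see that $J_{5\mu} = -iJ^{1+i2}_{\mu5}$. The normalized state $|\pi^-\rangle = (1/\sqrt2)(|\pi^1\rangle - i|\pi^2\rangle)$ so that $\langle 0|\pi^{1+i2}|\pi^-\rangle = 2/\sqrt2$. The current $J^{1+i2}_{\mu5}$ contains the term $-v\,\partial_\mu\pi^{1+i2}$ and thus $f = \sqrt2\,v$. Next, we have to work out the pion-nucleon coupling $g_{\pi NN}$ as defined in chapter IV.2. Here $\mathcal{L}$ contains $g\bar\psi i\vec\pi\cdot\vec\tau\gamma^5\psi$, which contains $\sqrt2\,g\,\pi^+\bar p\,i\gamma^5n$ since $\pi^{1-i2} = \sqrt2\,\pi^+$. Thus, $g_{\pi NN} = \sqrt2\,g$. Putting it together, we see that $M = gv$ translates to $2M = fg_{\pi NN}$ in agreement with chapter IV.2.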

VI.6.1 See figure VI.6.1. From $\Delta h = (d/\cos\theta) \simeq d(1 + \frac12\theta^2)$ and $(dh/dx) = \tan\theta \simeq \theta$, we have $(dh/dt) \propto \theta^2 \propto (dh/dx)^2$, thus giving rise to the term $(\lambda/2)(\vec\nabla h)^2$. It all goes back to Mr. Pythagoras.

VI.6.2 We integrate the term $\frac12\int d^Dx\,dt\,[((\partial/\partial t) - \nabla^2)h]^2$ in $S(h)$ by parts to obtain $-\frac12\int d^Dx\,dt\,[h((\partial/\partial t) + \nabla^2)((\partial/\partial t) - \nabla^2)h]$. Thus, the propagator is the inverse of the operator $(\partial/\partial t + \nabla^2)(\partial/\partial t - \nabla^2) = \partial^2/\partial t^2 - (\nabla^2)^2$, the Fourier transform of which is $-(\omega^2 + k^4)$.

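VI.8.3 With $h'(\vec x, t) = h(\vec x + g\vec ut, t) + \vec u\cdot\vec x + \frac{g}{2}u^2t$, we have $\partial h'/\partial t = \partial h/\partial t + g\vec u\cdot\vec\nabla h + (g/2)u^2$ and $\vec\nabla h' = \vec\nabla h + \vec u$. Thus the combination $(\partial h/\partial t) - \frac{g}{2}(\vec\nabla h)^2$ is invariant, as is (obviously) $\nabla^2h$. In other words, $S(h)$ must be constructed out of these two invariant combinations.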
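The invariance of $(\partial h/\partial t) - \frac{g}{2}(\vec\nabla h)^2$ can be tested numerically in one spatial dimension. The sketch below picks an arbitrary smooth test profile `h` (purely an illustrative assumption), builds the transformed profile, and compares the combination for $h'$ at $(x, t)$ with the combination for $h$ at the shifted point $(x + gut, t)$, using central differences.

```python
import math

def h(x, t):
    """Arbitrary smooth test profile, chosen only for illustration."""
    return math.sin(x) * math.cos(0.7 * t) + 0.2 * x * x

def make_h_prime(g, u):
    """h'(x, t) = h(x + g u t, t) + u x + (g/2) u^2 t in one dimension."""
    def h_prime(x, t):
        return h(x + g * u * t, t) + u * x + 0.5 * g * u * u * t
    return h_prime

def combination(f, x, t, g, eps=1e-5):
    """Central-difference estimate of df/dt - (g/2) (df/dx)^2 at (x, t)."""
    dfdt = (f(x, t + eps) - f(x, t - eps)) / (2.0 * eps)
    dfdx = (f(x + eps, t) - f(x - eps, t)) / (2.0 * eps)
    return dfdt - 0.5 * g * dfdx * dfdx

def invariance_gap(x, t, g, u):
    """Difference between the combination for h' at (x, t) and for h at (x + g u t, t)."""
    transformed = combination(make_h_prime(g, u), x, t, g)
    original = combination(h, x + g * u * t, t, g)
    return abs(transformed - original)

if __name__ == "__main__":
    print(invariance_gap(0.3, 1.1, g=2.0, u=0.4))
```

The cross term $g\vec u\cdot\vec\nabla h$ and the $(g/2)u^2$ term generated by the time derivative are exactly what the square of $\vec\nabla h + \vec u$ removes, which is why the gap vanishes up to discretization error.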

VI.8.5 Look at the action $S(h) = \frac12\int d^Dx\,dt\,[(\partial h/\partial t) - \nabla^2h - g(\vec\nabla h)^2/2]^2$. Comparing $\partial h/\partial t$ with $\nabla^2h$ we see that time has the dimension of length squared: $T \sim L^2$. From the term $\int d^Dx\,dt\,(\partial h/\partial t)^2$ and the fact that $S$ is dimensionless, we have $[h]^2 \sim T^2/(L^DT) \sim 1/L^{D-2}$ and so $h$ has the dimension of $(1/L^{D-2})^{1/2}$. Comparing $\nabla^2h$ with $g(\vec\nabla h)^2$ we see that $g$ has the dimension of $1/h$, that is, $L^{(D-2)/2}$.

VI.8.7 We are told that L(dg/dL) = (2— D)g/2 + (2D — 3 )fog 3 + • • • . We are assuming that the terms 
(• • •) can be neglected. Thus (in what follows a 2 and b 2 are two generic positive numbers) for D = 1, 
L(dg/dL) = a 2 g — b 2 g 3 and g flows toward the fixed point g* = a/b. (Incidentally, the KPZ equation 
is soluble for D = 1 by methods not explained in this text and both z and x are known exactly.) For 
D = 2, L(dg/dL) = b 2 g 3 and g flows toward some unknown strong (presumably) coupling fixed point. 
For D = 3, L(dg/dL) = — a 2 g + b 2 g 3 . The fixed point g* = a/b is unstable. For g < g*, g flows toward 
the trivial (i.e., free, or Gaussian) fixed point. Since the theory at the fixed point is free we know the 
critical exponents exactly: z = 2 and x = (2 — D)/ 2. For g > g*, g flows toward some unknown strong
(presumably) coupling fixed point. 


Part VII 

VII.1.1 Set $n^\mu A'_\mu(x) = 0$ with $A'_\mu = U^\dagger A_\mu U + iU^\dagger\partial_\mu U$, so that $n\cdot\partial U(x) = in\cdot A(x)U(x)$. Define $\lambda(x) = r\cdot x/(r\cdot n)$ for any 4-vector $r$ and write $x = \lambda(x)n + x_\perp$, so that $r\cdot x_\perp = 0$. Then

$$U(x) = P\,e^{\,i\int_0^{\lambda(x)} d\sigma\;n\cdot A(\sigma n + x_\perp)}$$

(with a path ordering) solves the differential equation, since $n\cdot\partial\lambda = 1$ by construction.

VII.1.2 Using the BHC formula given, we have (the $V$’s are clearly irrelevant)

$$U_{ij}U_{jk} = e^{iaA_\mu}e^{iaA_\nu} = e^{\,ia(A_\mu + A_\nu) - \frac12a^2[A_\mu, A_\nu] + a^3C + a^4D + O(a^5)}$$

Similarly,

$$U_{kl}U_{li} = e^{-iaA'_\mu}e^{-iaA'_\nu} = e^{\,-ia(A'_\mu + A'_\nu) - \frac12a^2[A'_\mu, A'_\nu] + a^3E + a^4F + O(a^5)}$$

the prime reminding us that the $A_\mu$ and $A_\nu$ in this expression are to be evaluated on the “north” and “west” side of the plaquette in figure VII.1.2, respectively, in contrast to the $A_\mu$ and $A_\nu$ in $U_{ij}U_{jk}$, which are evaluated on the “south” and “east” side, respectively. Here $C$, $D$, $E$, and $F$ denote various commutators, which we drag along merely to show that they eventually drop out in what interests us. (Note how the different terms are associated with different powers of $a$, as indicated. Note also that in some places we have dropped the prime on $A$ and absorbed the “error” in doing so into terms of higher order in $a$.) Thus,

$$U_{kl}U_{li} = e^{\,-ia(A_\mu + A_\nu) - ia^2(\partial_\nu A_\mu - \partial_\mu A_\nu - \frac12i[A_\mu, A_\nu]) + a^3G + a^4H + O(a^5)}$$

where $G$ and $H$ denote sums of commutators and terms such as $\partial_\nu\partial_\nu A_\mu$ and $\partial_\nu\partial_\nu\partial_\nu A_\mu$. Applying the BHC formula again to the order indicated we have

$$U_{ij}U_{jk}U_{kl}U_{li} = e^{\,ia^2(\partial_\mu A_\nu - \partial_\nu A_\mu) - a^2[A_\mu, A_\nu] + O(a^3)} = e^{\,ia^2F_{\mu\nu} + a^3I + a^4J + O(a^5)}$$

with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i[A_\mu, A_\nu]$. The same remarks on $G$ and $H$ apply to $I$ and $J$. The Yang-Mills field strength emerges naturally, as we would anticipate. Since the traces of commutators and of $A$ vanish, when we apply the trace all the junk drops out to $O(a^5)$ and we have

$$S(P) = \mathrm{Re}\;\mathrm{tr}\,[1 - \tfrac12a^4F_{\mu\nu}F_{\mu\nu} + O(a^5)]$$

By gauge invariance, the corrections must be of even order in $a$ but for our purposes we don’t care about them anyway. Evidently, $f$ and $g$ are related by some uninteresting factors of $a$.


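Part VIII


VIII.1.7 $R^{12} = d\omega^{12} = d(-\cos\theta\,d\varphi) = \sin\theta\,d\theta\,d\varphi = 2R^{12}{}_{\theta\varphi}\,d\theta\,d\varphi \Rightarrow R^{12}{}_{\theta\varphi} = \frac12\sin\theta$. Since $e^\theta_1 = 1$, $e^\varphi_2 = 1/\sin\theta$, we obtain $R = R^{ab}{}_{\mu\nu}\,e^\mu_a e^\nu_b = 4R^{12}{}_{\theta\varphi}\,e^\theta_1e^\varphi_2 = 2$, independent of $\theta$ and $\varphi$ as expected.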


Part N 

N.1.1 The effective action for an electrically neutral system is given in the point particle limit by $S = \int d\tau\,(-m + b_E E_\mu E^\mu + b_B B_\mu B^\mu + \cdots)$, with $E_\mu$ and $B_\mu$ defined in the text. The interaction terms involve two powers of derivatives, which translate into two powers of $\omega$ in the scattering amplitude and hence four powers of $\omega$ in the scattering cross section. (Note that a possible term like $\int d\tau\,F_{\mu\nu}F^{\mu\nu}$ can be absorbed into the two terms already displayed.)


N.3.2 As in (III.3.7) we have $3V_3 + 4V_4 = 2I + n$ where $I$ denotes the number of internal lines. The number of loops (III.3.6) $L = I - (V_3 + V_4 - 1)$ is 0 in a tree diagram. Thus $V_3 = n - 2 - 2V_4 \le n - 2$.
Books on field theory

This is a list of field theory textbooks that I know about. I do not necessarily recommend them all. 
In food as in books, each has his or her own taste. 

T. Banks, Modern Quantum Field Theory, Cambridge University Press, New York, 2008.

J. D. Bjorken and S. D. Drell, Relativistic Quantum Mechanics, McGraw-Hill, New York, 1964. 
-, Relativistic Quantum Fields, McGraw-Hill, New York, 1965. 

L. S. Brown, Quantum Field Theory, Cambridge University Press, New York, 1992. 

S. J. Chang, Introduction to Quantum Field Theory, World Scientific, Singapore, 1990. 

T. P. Cheng and L. F. Li, Gauge Theory of Elementary Particle Physics, Clarendon Press, Oxford, 1984. 

F. Dyson and D. Derbes, Advanced Quantum Mechanics, World Scientific, Singapore, 2007. 

R. P. Feynman, Quantum Electrodynamics, W. A. Benjamin, New York, 1962. 

K. Huang, Quantum Field Theory, John Wiley & Sons, New York, 1998. 

C. Itzykson and J-B. Zuber, Quantum Field Theory, McGraw-Hill, New York, 1980.

T. D. Lee, Particle Physics and Introduction to Field Theory, Taylor & Francis, New York, 1981. 

V. P. Nair, Quantum Field Theory, Springer, New York, 2005. 

M. E. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory, Perseus, Reading MA, 
1995. 

L. H. Ryder, Quantum Field Theory, 2nd Ed., Cambridge University Press, New York, 1996. 

M. Srednicki, Quantum Field Theory, Cambridge University Press, New York, 2007.

G. Sterman, An Introduction to Quantum Field Theory, Cambridge University Press, New York, 1993. 

S. Weinberg, Quantum Theory of Fields, Vols. 1 & 2, Cambridge University Press, New York, 1996. 
X. G. Wen, Quantum Field Theory of Many-Body Systems, Oxford University Press, New York, 2007. 

and finally, of course, 

F. Mandl, Introduction to Quantum Field Theory, Interscience, New York, 1959. 
Books on various topics mentioned

A. A. Abrikosov, L. Gorkov, and A. Dzyaloshinski, Methods of Quantum Field Theory in Statistical 
Physics, Prentice Hall, Englewood Cliffs, NJ, 1963. 

S. L. Adler, “Perturbation Theory Anomalies,” in: Lectures on Elementary Particles and Quantum Field 
Theory, 1970, Brandeis University Summer Institute in Theoretical Physics, S. Deser et al, ed., 
MIT Press, Cambridge, 1970. 

P. Anderson, Basic Notions of Condensed Matter Physics, Benjamin-Cummings, Menlo Park, CA, 1984.

D. Bailin and A. Love, Supersymmetric Gauge Field Theory and String Theory, IOP Publishing, Bristol 
and Philadelphia, 1994. 

R. Balian and J. Zinn-Justin, eds., Methods in Field Theory, North Holland Publishing, Amsterdam, 
and World Scientific, Singapore, 1981. 

A. L. Barabasi and H. E. Stanley, Fractal Concepts in Surface Growth, Cambridge University Press, 
Cambridge, 1995. 

D. Budker, S. J. Freedman, and P. H. Bucksbaum, eds., Art and Symmetry in Experimental Physics:
Festschrift for Eugene D. Commins, American Institute of Physics, New York, 2001. 

J. Cardy, Scaling and Renormalization in Statistical Physics, Cambridge University Press, New York, 
1996. 

S. Coleman, Aspects of Symmetry, Cambridge University Press, Cambridge, 1985. 

J. Collins, Renormalization, Cambridge University Press, Cambridge, 1985. 

E. D. Commins, Weak Interactions, McGraw-Hill, New York, 1973. 

E. D. Commins and P. H. Bucksbaum, Weak Interactions of Leptons and Quarks, Cambridge University 
Press, Cambridge, 2000. 

M. Creutz, Quarks, Gluons and Lattices, Cambridge University Press, Cambridge, 1983. 

P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford University Press, Oxford, 1935. (On 
p. 253 he explained why he wanted the equation of motion for the electron to be first order in time 
derivative.) 

A. Dobado et al., Effective Lagrangians for the Standard Model, Springer-Verlag, Berlin, 1997.

O.J.P. Eboli et al., Particle Physics, World Scientific, Singapore 1992. 

R. P. Feynman, Statistical Mechanics, Perseus Publishing, Reading, MA, 1998. 

R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 
1965. 

J. M. Figueroa-O’Farrill, Electromagnetic Duality for Children, on the World Wide Web, 1998.

V. Fitch et al., eds., Critical Problems in Physics, Princeton University Press, Princeton, 1997. 

M. Gell-Mann and Y. Ne’eman, The Eightfold Way, W. A. Benjamin, New York, 1964. 

H. B. Geyer, ed., Field Theory, Topology and Condensed Matter Physics, Springer, 1995 (A. Zee, 
“Quantum Hall Fluids.”) 

M. L. Goldberger and K. M. Watson, Collision Theory, Dover, New York, 2004. 

N. Goldenfeld, Lectures on Phase Transitions and the Renormalization Group, Addison-Wesley,
Reading, MA, 1992.

F. Guerra and N. Robotti, Ettore Majorana: Aspects of His Scientific and Academic Activity, Springer, 
New York, 2008. 

C. Itzykson and J-M. Drouffe, Statistical Field Theory, Cambridge University Press, Cambridge, 1989. 

S. Iyanaga and Y. Kawada, eds., Encyclopedic Dictionary of Mathematics, MIT Press, Cambridge, 1980. 

L. Kadanoff, Statistical Physics, World Scientific, Singapore, 2000. 

G. Kane and M. Shifman, eds., The Supersymmetric World: The Beginning of the Theory, World 
Scientific, Singapore, 2000. 

J. I. Kapusta, Finite-Temperature Field Theory, Cambridge University Press, Cambridge, 1989. 

L. D. Landau and E. M. Lifschitz, Statistical Physics, Addison-Wesley, Reading, MA, 1974. 

S. K. Ma, Modern Theory of Critical Phenomena, Benjamin/Cummings, Reading, MA, 1976.

H. J. W. Muller-Kirsten and A. Wiedemann, Supersymmetry, World Scientific, Singapore, 1987.

T. Muta, Foundations of Quantum Chromodynamics, World Scientific, Singapore, 1998. 

D. I. Olive and P. C. West, eds., Duality and Supersymmetric Theories, Cambridge University Press, 
Cambridge, 1999. 

J. Polchinski, String Theory, Cambridge University Press, Cambridge, 1998. 

J. J. Sakurai, Invariance Principles and Elementary Particles, Princeton University Press, Princeton, 
1964. 

L. Schulman, Techniques and Applications of Path Integrals, John Wiley & Sons, New York, 1981. 

R. F. Streater and A. S. Wightman, PCT, Spin Statistics, and All That, W. A. Benjamin, New York,
1968. 

G. ’t Hooft, Under the Spell of the Gauge Principle, World Scientific, Singapore, 1994.

G. ’t Hooft et al., eds., Recent Developments in Gauge Theories, Plenum, New York, 1980.

D. Voiculescu, ed., Free Probability Theory, American Mathematical Society, Providence, R.I., 1997. 

S. Weinberg, Gravitation and Cosmology, John Wiley & Sons, New York, 1972. 

C. N. Yang, Selected Papers 1945-1980 with Commentary, W. H. Freeman, San Francisco, 1983. 

A. Zee, Unity of Forces in the Universe, World Scientific, Singapore, 1982. 

J.-B. Zuber, ed., Mathematical Beauty of Physics, World Scientific, Singapore, 1997. 


Some popular books and books on the history of quantum field theory 

M. Bartusiak, Einstein's Unfinished Symphony, Joseph Henry Press, Washington, D.C., 2000. 

I. Duck and E. C. G. Sudarshan, Pauli and the Spin-Statistics Theorem, World Scientific, Singapore,
1997.

R. P. Feynman, QED: The Strange Theory of Light and Matter, Princeton University Press, Princeton, 
2006. 

D. Kaiser, Drawing Theories Apart, University of Chicago Press, Chicago, 2005. 

A. I. Miller, Early Quantum Electrodynamics, Cambridge University Press, Cambridge, 1994. 

L. O’Raifeartaigh, The Dawning of Gauge Theory, Princeton University Press, Princeton, 1997. 

S. S. Schweber, QED and the Men Who Made It: Dyson, Feynman, Schwinger, and Tomonaga, Princeton 
University Press, Princeton, 1994. 

A. Zee, Fearful Symmetry, Princeton University Press, Princeton, 1999. 

A. Zee, Einstein's Universe, Oxford University Press, New York, 2001.

A. Zee, Swallowing Clouds, University of Washington Press, Seattle, 2002.


Further Reading for Part N 

In writing a textbook, the author has the luxury of not preparing a detailed scholarly bibliography 
(unless he or she chooses to follow the example of S. Weinberg, who is, in my opinion, most 
admirable in this regard). Even more extravagant is the freedom accorded to authors of popular 
books who in most cases give their unsuspecting and gullible readers the impression that the physics 
of an entire era was done by two or three greats, individuals worthy of their own personality cults. 
Presenting recent developments still in flux, I am faced with the dilemma of whether to give proper 
credit. In scholarly publications, conscientious referencing is of course ethically mandated, but this 
is a textbook. Fortunately, in this age of omniscient search engines, the reader could easily compile 
a bibliography more exhaustive than even a myopic humanist used to be able to muster in half a 
lifetime. I could do the same, but it is of little help to you for me to merely list the names of those 
responsible for, say, the new way of computing amplitudes using the spinor helicity formalism. 1
Instead, I can best serve the typical reader by listing a few papers and review articles starting from 
which you can track down the literature to your scholarly heart’s desire. To those who feel that they 
should be mentioned, I apologize and refer you to Glashow’s description of a tapestry in the preface. 

W. Goldberger and I. Z. Rothstein, arXiv: hep-th/0409156v2. 

Z. Bern, L. J. Dixon, D. C. Dunbar, D. A. Kosower, arXiv: hep-ph/9602280. 

N. Arkani-Hamed and J. Kaplan, arXiv: hep-th/0801.2385. 

E. Witten, arXiv: hep-th/0312171. 


1 F. A. Berends, Z. Bern, L. Chang, P. De Causmaecker, L. J. Dixon, D. C. Dunbar, R. Gastmans, W. Giele, 
J. F. Gunion, R. Kleiss, D. A. Kosower, Z. Kunszt, M. Mangano, A. G. Morgan, S. J. Parke, W. J. Stirling, T. R. 
Taylor, W. Troost, T. T. Wu, Z. Xu, D. H. Zhang, and many many others. I know how to copy and paste also! Please
forgive me if I inadvertently left you off this list. 



Index 


Page numbers followed by letters f and n refer to figures and notes, respectively. 


Abrahams, Elihu, 366 
accelerators, 42, 483 
Adler, Steve, 277 

Aharonov-Bohm effect, 251-252, 261, 317, 320 
amplitudes: as analytic functions, 208-209;
symmetry in, 78. See also meson-meson scattering
amplitude

“amputating the external legs,” 55 
analyticity, in quantum field theory, 207-209, 211, 
217, 219. See also nonanalyticity 
Anderson, Phil: in “Gang of Four,” 366; Nobel Prize 
for, 351 

Anderson localization, 351, 354; in renormalization 
group language, 366-367 
Anderson mechanism, 264 
angular momentum, addition of, 530 
anharmonicity, in field theory, 43, 89 
anomaly (axial anomaly/chiral anomaly), 270, 275; 
alternative ways of deriving, 279; consequences of, 
275-278; Feynman diagram calculation revealing, 
270-274; grand unification and freedom from, 
411-412, 429; higher-order quantum fluctuations 
and, 310; in nonabelian gauge theory, 276; 
nonrenormalization of, 277; and path integral 
formalism, 278 
anthropic selection, 451 

anticommutation: spin-statistics connection and, 
122, 123; wave function of electrons and, 107 
antiferromagnet(s): effective low energy description
of, 344-345; magnetic moments in, 344; Néel
state for, 346
antikink(s), 304, 305f 

antimatter: discovery of, 101; requirement of, 157 
antineutrino field, in SO(10) unification, 425-426
antiunitary operator, time reversal as, 103 
anyon(s), 315; interchanging, 316; statistics between, 
317 

approximation, steepest-descent, 16 
area law, 377, 387 

asymptotically free theories, 360, 386, 390;
Gross-Neveu model as, 403
atom(s), interaction with radiation, 3 
attraction: quantum field theory on, 35-36; spin 1 
particle and, 36-37; spin 2 particle and, 36 
auxiliary field, 192, 467-468 
axial anomaly. See anomaly 
axial current conservation, quantum fluctuations 
destroying, 274-275 
axial gauge, 378 

background field method, 504-507 
Bardeen, Bill, 277 
bare perturbation theory, 175 
baryon number conservation, law of, 413; grand 
unification and violation of, 418 
BCFW recursion, 500, 507, 514 
Berends, F. A., 493 
Bern, Zvi, 484, 484f, 516, 519 
Bern transformation, 516
Berry’s phase, nonabelian, 261, 346 
Berry’s phase term, in ferromagnets and 
antiferromagnets, 345, 346 
beta decay, 456 
Bethe, Hans, 365 
Bianchi identity, 247, 261 
Bjorken, James, 356 

black hole(s): gravitational waves in, 479-482; 
Hawking radiation from, 290-291; Schwarzschild, 
311 

Bludman, Sid, Yang-Mills theory and, 379 
blue sky, effective field theory of, 457-458 
Bogoliubov, N. N., 192 

Bogoliubov calculation, of gapless mode, 284 
Bogomol’nyi inequality, 305; for mass of monopole, 
309 

Bogomol’nyi-Prasad-Sommerfeld (BPS) states, 309 
Bohr, Niels, 252 

Boltzmann, Ludwig, 150, 287; on entropy, 311 
Bose-Einstein condensation, 295 
Bose-Einstein statistics, 120 
Bose field, mass correction to, divergence of, 180 
boson(s): “bad” behavior of, 180; electron pairing 
into, 295; and fermions, unification of, 461; 
gapless mode in, 284; gauge (see gauge boson[s]); 
intermediate vector, and Fermi theory of the 
weak interaction, 171-172, 309; Lorentz invariant 
scalar field theory on, 190; mass correction for, 
divergence of, 180; massless, emergence of, 226- 
227; Nambu-Goldstone (see Nambu-Goldstone 
boson[s]); nonabelian gauge (Yang-Mills), 257, 
386, 434; in nonrelativistic theory, 192-193; 
repulsion of, 283, 337 

BPS (Bogomol’nyi-Prasad-Sommerfeld) states, 309 

brane world scenarios, 40-42, 450 

Brezin, Edouard, 402 

Brillouin zone, 298 

Brink, Lars, 470 

Britto, Ruth, 500 

Burgoyne, N., 121 

Cachazo, F. A., 500 

canonical formalism, 61-69; and degrees of freedom, 
67; and Feynman diagrams, 43; vs. path integral 
formalism, 44, 61, 67; propagator in, 67-68; time 
ordering in, 67-68 
Carrasco, J. J., 519 

Casimir force, between two plates, 70-75, 71f 
Cauchy’s theorem, 209 

central identity of quantum field theory, 182, 524 
chain rule, 445 

charge: in dual theory, vortices as, 332-334; as
generator, 80; of quasiparticles, 327. See also
electric charge; magnetic charge
charge conjugation, 101-102; in grand unification, 
429 

charge quantization, deducing, 121 
Chern-Simons term, 317; gauge invariance of, 328; 
for Hall fluid, 326; massive Dirac fermions and, 
319-320; and Maxwell term, 320; in nonrelativistic 
theory, 320 

Chern-Simons theory, 318-319; effective theory of 
Hall fluid as, 324 
Chew, Geoff, 105 
chiral anomaly. See anomaly 
chiral superfield, 464, 466 
chiral symmetry: condition for, 419; conserved 
current associated with, 100; of strong interaction, 
234, 387-388 

classical limit, path integral formalism for taking, 19 
classical physics, symmetry of, 270 
Clifford algebra: and Dirac bilinears, 97; and Dirac 
equation, 94-95 
closed forms, 247 
coherence length, 296 
Coleman, Sidney, 252, 473 
Coleman-Mermin-Wagner theorem, 230 
Coleman-Weinberg effective potential, 240 
color, quark, 385, 386 
complex plane, 207, 208 
Compton, Arthur, 154 
Compton scattering, 152-157 
condensation, and superconductivity, 295 
condensed matter physics: critical dimension in, 
364, 367; disordered systems studied by, 350, 354; 
goal of, 328; Goldstone’s theorem in, 229-230; 
impurities studied by, 354; length scales in, 169; 
momentum density in, 191; number conjugate 
to phase angle in, 192; particle physics and, 281, 
452-453; quantum field theory and, 5, 190, 281; 
quantum Hall effect and, 351-352; quasiparticles 
in, 326; renormalization group in, 360-363; 
spin-statistics rule and, 120 
conductivity, vs. conductance, 366-367 
connected graphs, vs. disconnected graphs, 29, 47 
conserved current: charge associated with, 80; and 
chiral symmetry, 100; and continuous symmetry, 
78-79; momentum space version of, 133 
continuous symmetry, 77-78, 226; conserved current 
and, 78-79 

continuous symmetry breaking, 226; Coleman- 
Mermin-Wagner theorem on, 230; and massless 
fields, 228-229 
Cooper pairs of electrons, 295 
coordinate transformations, 81-82 
cosmic coincidence problem, 450
cosmological constant, 448-449; measured in units
of GeV^4, 449-450; order of magnitude, 449
cosmological constant problem, 450; approach 
to, 455; root of, 448; string theory’s inability to 
resolve, 450; supersymmetry as solution to, 461 
Coulomb potential, 133-134, 143; modification of, 
205 

Coulomb’s electric force: and Newton’s gravitational 
force, comparison of, 29; quantum field theory on, 
32-33 

counterterms: cutoff dependence absorbed by,
241; in Feynman diagrams, 175-176, 176f;
nonrenormalizable theories and, 179, 241
coupling, electromagnetic, 358 
coupling constant(s), 164, 173; dimensionless,
170; of electromagnetic interaction, 359; hadron
proliferation and, 231; as misnomer, 358-359;
pion-nucleon, 235; renormalized, 166; Yang-Mills,
258-259

coupling renormalization, 173-174 
CPT theorem, 104 

critical dimension, in condensed matter physics, 
364, 367 

critical phenomena, 292; complete theory of, 293;
Landau-Ginzburg theory of, 293-294
crossing, 156 

cubic vertex, in spinor helicity formalism, 496 
current conservation. See conserved current 
curved spacetime: Dirac action in, 445; introduction 
to, 84-86; quantum field theory in, 82, 290 
Cutkosky cutting rule, 215-216, 217, 219, 493 
cutoff dependence, 163; avoiding in physical
perturbation theory, 176; counterterms absorbing,
241; disappearance of, 166, 167; of meson-meson
scattering amplitude, 173

dark energy, 450 
dark matter, 450 
Dashen, Roger, 405 
decay rate, 139-141, 212 
Deser, Stanley, 470 

differential forms, 246-247; use in nonabelian gauge 
theory, 255, 256 

differential operator, propagator as inverse of, 23 
dilatation invariance, 84n 
dimensional analysis, 169-170, 453; on meson- 
meson scattering amplitude, 173 
dimensional regularization, 167, 168, 204 
Dirac, Paul, 105; on electric charge, quantizing of, 
410; on magnetic monopole, 308; metaphorical 
language of, 113; on path integral formalism, 
10-13; on positron, 5; on quantum mechanics
and magnetic monopoles, 245; on spinor
representation, 117; teaching style of, 454
Dirac basis vs. Weyl basis, 98-99 
Dirac bilinears, 97 

Dirac equation, 93-105; Clifford algebra and, 94- 
95; in curved spacetime, 444; and degrees of 
freedom, reduction in, 95; derivation of, 118; 
electromagnetic field and, 101; handedness and, 
100; Lorentz transformation and, 96-97; magnetic 
moment of electron in, 194-195; origins of, 93- 
94; parity and, 98; in solid state physics, 298, 299; 
solving, 98; time reversal and, 104 
Dirac field: interacting with scalar field, Feynman 
rules for, 53-54, 534-535; interacting with vector 
field, 100; interacting with vector field, Feynman 
rules for, 129, 129f, 535-536; propagator for,
127; quantizing, 107-113, 122; quantizing by
Grassmann path integral, 127; vacuum energy of,
111-112, 125
Dirac operator, 113 

Dirac spinor, 94, 96; components of, 117; and 
supersymmetry, 114 

disorder: Anderson localization of, 351, 354; 
condensed matter physics and study of, 350, 354; 
Grassmannian approach to, 354 
dispersion relations, 208-210, 217-218, 235 
Di Vecchia, Paolo, 470 

divergence(s): degree of, 176-178; dependence on 
dimension of spacetime, 179; with fermions, 178- 
179; logarithmic, 175, 176-177; in quantum field 
theory, 57-58, 161-162; superficial degree of,
176, 179; total, supersymmetric transformation,
465-466

dotted and undotted notation, 116-117, 541-544; 
replacing, 475 

double-line formalism, 395, 396 
double-slit experiment, 7, 8f; expansion of, 7-9, 8f, 9f 
double-well potential, 224, 224f 
Drell, Sid, 356 

duality: in (2 + 1)-dimensional spacetime, 335; 
concept of, 331, 332; electromagnetic, 249; and 
linking of perturbative weak coupling to strong 
coupling, 473; of monopoles, 334; nonrelativistic 
treatment of, 336-337; relativistic treatment of, 
335-336; of string theories, 334; vortex, 334 
dual theory, vortices as charges in, 332-334 
dynamical symmetry breaking, 230; example of, 388 
dynamical variable, in quantum field theory, 19 
dyon, 309 

Dyson, Freeman, 60 
Dyson gas approach, 400-402 

effective field theory: of blue sky, 457-458; 
development of, 452; Fermi theory of the weak 
interaction as, 456; gravitational waves and, 479- 
482; of Hall fluid, 324-325, 452-453; of neutrino 
masses, 456; predictive power of, 456-459; of 
proton decay, 455-457; recent developments in, 
479-482; and renormalization group flow, 453; 
reshuffling terms in, 458-459 
effective potential, 238-239; Coleman-Weinberg,
240; generated by quantum fluctuations, 243
Ehrenfest, P., 441 

Einstein, Albert: and cosmological constant, 449; and 
repeated indices summation convention, 475n 
Einstein-Hilbert action for gravity, 433-434; 
Newtonian gravity derived from, 438; Yang-Mills 
action compared with, 434-435 
Einstein Lagrangian, linearized, 34 
Einstein’s theory of gravity, 81, 83; and deflection of 
light, 439-440; gravitational waves in, 479-481; 
nonrenormalizability of, 172; Yang-Mills theory 
compared with, 444-445, 513-520 
electric charge: quantized, grand unification on,
410; quantum fluctuations and, 204, 205;
renormalization of, 205

electric force: and gravitational force, comparison of, 
29; quantum field theory on, 32-33 
electromagnetic coupling, flow of, 358 
electromagnetic duality, 249 

electromagnetic field: Dirac equation in presence of, 
101; as quantum field, 3-4; stress-energy tensor 
of, 83-84 

electromagnetic force: between like charges, 33; 
knowledge of, 448 

electromagnetic wave, degrees of freedom of, 38 
electromagnetism: Faddeev-Popov method applied to, 
185-186; and gravity, unification of, 442; Maxwell 
on (see Maxwell theory of electromagnetism); 
weakness of, 414-415 

electron(s): absolute identity of, 120-121, 134; binary 
strings in, 428; Bose-Einstein statistics for, 120; 
in condensed matter system, 281; Cooper pairs 
of, 295; degrees of freedom of, 95, 99; effect 
of magnetic field on, 251-252; energy levels 
available to, 5; Fermi-Dirac statistics for, 120; 
as fermions, 322; fractional Hall state of, 324; 
magnetic moment of (see magnetic moment 
of electron); mass of, in classical physics, 180; 
noninteractive hopping, 298-299, 298f, 299f; 
pairing into bosons, 295; photon fluctuation into, 
200-202, 201f; photon scattering on, 152-157, 
153f, 157f; requirements of antisymmetric wave 
function, 107; stability of, 413 
electron-positron annihilation, 155-156, 389-391 
electron scattering, 132-143; cross sections for, 137-
143; off electrons, 134-138, 135f; off nucleons,
deep inelastic, 386; off protons, 132-134, 133f,
199; off protons, deep inelastic, 359; off protons,
Schrödinger equation for, 3; to order e^4, 145-149,
145-147f, 148f; potential, 133-134, 134f
electroweak theory, 170-171, 379; construction of,
379-383; renormalizability of, 384 
energy: dark, 450; fundamental definition of, 83; 
of mass, 35; quantum mechanics and special 
relativity on, 3; of vacuum (see vacuum energy) 
energy density, 35 
energy-momentum tensor, 319 
energy scales: in particle physics, 169;
renormalization group and, 361
entropy, Boltzmann on, 311 
Euclidean path integral, 12 

Euclidean quantum field theory, 287-288; and high- 
temperature quantum statistical mechanics, 289; 
and quantum statistical mechanics, 289 
Euler, Leonhard, 460 
Euler-Lagrange equation, 12, 80, 438, 448 
exact forms, 247 

Faddeev-Popov method, 183-185, 267, 371; applying
to electromagnetism, 185-186; and derivation of 
graviton propagator, 437 
Feng, Bo, 500 
Fermi, Enrico, 105, 137
Fermi coupling, 170 
Fermi-Dirac statistics, 120 

Fermi field, mass correction to, divergence of, 180 
Fermi liquid, gapless modes in, 285 
fermion(s): and bosons, unification of, 461; degree 
of divergence with, 178-179; electrons as, 322; 
Feynman rules for, 128-131, 128f; in lattice gauge 
theory, 376; mass correction for, divergence of, 
180; massive Dirac, and Chern-Simons term, 
319-320 

fermion-fermion scattering, Feynman diagram for, 
172, 172f

fermion masses: in grand unification, 417-418;
naturally small, 419
fermion normalization factors, 134 
fermion propagator, 112 

Fermi theory of the weak interaction, 232; as effective 
field theory, 456; intermediate vector boson and, 
171-172; nonrenormalizability of, 170, 179, 384; 
predictive power of, 453; within electroweak 
theory, 171 

ferromagnet(s), 229; effective low energy description 
of, 344-345; low energy modes in, 345-346; 
magnetic moments in, 344; order in, 328 
ferromagnetic transition, 295 
Feynman, Richard: contribution of, 43; on difficulty 
of quantum electrodynamics, 61; on Dirac, 105;
metaphorical language of, 113; on path integral 
formalism, 7-10; study of calculus by, 522; on 
trace products of gamma matrices, 137; Yang-Mills 
theory and, 371 

Feynman diagrams: beginning of, 29, 30f; breaking 
shackles of, 311; canonical formalism and, 43; 
childish game generating, 53, 53f; connected 
vs. disconnected, 29, 47; counterterms in, 175-
176, 176f; Cutkosky cutting rule for, 215-216,
217, 219; discovering, 43-51, 45f, 46f; dominance 
of, 302; for electron scattering, 132-134, 133f, 
134f, 135f; evaluating, 538-539; for fermion- 
fermion scattering, 172, 172f; finite temperature, 
289; function of, 50; imaginary part of, 207-219, 
208f, 213f; limitations of, 67; loop, 45, 57-58,
57f, 58f, 181, 494; in momentum space, 54; new
approaches to, 483-486; orientation of, 54; path 
integral formalism and, 44; in perturbation theory, 
55, 56f; for photon scattering, 152, 153f, 155, 155f; 
regularization of, alternative ways of, 166; relating 
infinite sets of, 234-235; in spacetime, 54, 58, 213 
Feynman gauge, 149 

Feynman rules, 534-537; colored, 485, 491, 495; 
discovery of, 60; for fermions, 128-131, 128f; in 
nonabelian gauge theory, 536-537; in physical 
perturbation theory, 175-176, 176f; for quantum 
electrodynamics, derivation of, 144-150; in 
random matrix theory, 397, 398f; for scalar field, 
54-55, 534-535; in spontaneously broken gauge 
theories, 266-267; for vector field, 129, 130f,
535-536; in Yang-Mills theory, 257, 257f, 494-495 
field redefinition, 68-69, 218, 342 
field renormalization, 175 
field strength, construction of, 255-257 
Fierz, M., 121 
Fierz identities, 459 
Fisher, Matthew P. A., 336n 
Fisher, Michael, 293; and renormalization groups, 
361 

fixed point(s), strong coupling, 359 
flux: fundamental unit of, 324; gauge potential and, 
334 

force: origin of, 29; particle and, 27-29. See also 
specific force 

forms: closed vs. exact, 247-248; geometric character 
of, 250-251, 250f 
fractional Hall effect, 323-324 
fractional (anyon) statistics, 315; coupling to 
gauge potential, 316-317; gauge boson and, 320; 
misleading nature of term, 317; and quasiparticles, 
327 

freedom, degrees of, 37-38; canonical formalism 
and, 67; Dirac equation and reduction in, 95; of
electron, 95, 99; gauge invariance as redundancy
in, 268; longitudinal, in massive gauge field, 264; 
of photon, 186-187 

free field theory (Gaussian theory), 21-23, 43; in 
terms of Fourier transform, 26 
Fujikawa, Kazuo, 278 

gamma matrices, 94, 117, 538; products of, 95- 
96; trace products of, evaluating, 136-137, 
153-154 

Gamow, George, 120n 
“Gang of Four,” 366 

gapless mode, 284; Bogoliubov calculation of, 284;
linearly dispersing, 284-285
gauge boson(s): and fractional statistics, 320; and 
intermediate vector boson, 309; mass spectrum 
of, 266 

gauge fixing, 183 

gauge invariance, 83n, 144, 475; of Chern-Simons 
term, 328; and Dirac quantization of magnetic 
charge, 248; discovery of, 144n; in lattice 
gauge theory, 376; in nonabelian gauge theory, 
preserving, 204; origin of, 183; proof of, 145-150, 
203-204; as redundancy in degrees of freedom, 
268; regularization respecting, 202-204; and 
renormalizability, 411 

gauge potential, 251; flux associated with, 334; 
fractional statistics and, 316-317; in Hall fluid, 
325-326, 329; nonabelian, 254, 255 
gauge theory(ies): Faddeev-Popov quantization of, 
183-185, 267; and fiber bundles, correspondence 
between, 256; gravity as, 436; lattice, 374-376;
recent developments in, 497-512; redundancy
in, 183-185, 189; S-matrix theory and, 498-
501; spontaneously broken, Feynman rules
for, 266-267; spontaneously broken, magnetic
monopoles in, 309; and superconductivity
theory, 296; symmetry breaking in, 263-265,
268, 296; unsatisfactory formulation of, 474,
497; vortex in, 307. See also nonabelian gauge
theory(ies)

gauge transformation (local transformation), 187,
254; and general coordinate transformation,
443

Gauss-Bonnet theorem, 457, 459 
Gaussian integration, 14, 523 
Gaussian theory (free field theory), 21-23, 43; in 
terms of Fourier transform, 26 
Gell-Mann, Murray: and effective field theory, 460; 
on quark color, 385; and seesaw mechanism, 426; 
σ model of, 340-341; SU(3) of, 531; Yang-Mills
theory and, 371 
Gell-Mann matrices, 265 
general coordinate invariance, 36 
general coordinate transformation, and gauge
transformation, connection between, 443 
general covariance, principle of, 81 
general relativity: finite size objects in, 480-482; and 
quantum mechanics, marriage of, 6; review of,
84-86 

generator, charge as, 80 

Georgi, Howard, 421n; grand unification theory of,
407-409 

ghost fields, 372-374 
Giele, W., 493 

Ginzburg, V., 264; on London penetration length, 
296; on second-order transitions, 292; on 
superconductivity, 295 
Girvin, Steve, 328 

Glashow, Sheldon, 171; electroweak theory of, 383; 
grand unification theory of, 407-409; and seesaw 
mechanism, 426; Yang-Mills theory and, 379 
gluon(s), 386; origins of concept, 235 
gluon scattering, 483-496; approaches to calculation 
of, 483-493, 484f, 491f; spinor helicity formalism 
and, 486-491, 496 

Goldberger, Murph L., 105, 137, 460 
Goldberger-Treiman relation, 235, 342 
Goldstone’s theorem, 228-229; in condensed matter 
physics, 229 
Golfand, Yu. A., 461 
Gordon decomposition, 195 
Goto, T., 469 

grand unification, 452; binary code in, 426- 
428; charge conjugation in, 429; and deeper 
understanding of physics, 410-411; fermion 
masses in, 417-418; and freedom from anomaly, 
411-412, 429; and hierarchy problem, 419; 
need for, 407; and origin of matter, explanation 
for, 418; and proton decay, 413-414, 415,
456; SO(10): antineutrino field in, 425-426;
SO(18), 428; spinor representation of, 421-
423, 424, 426; SU(5), 531; SU(5), Georgi
and Glashow theory of, 407-409; triumph of,
415-416 
Grant, A. K., 516 
Grassmannian symmetry, 355 
Grassmann integration, 126-127 
Grassmann number(s), 123, 126; in path integral for 
spinor field, 124 
Grassmann variables, 246 
gravitational force. See gravity 
gravitational interaction, 36 
gravitational waves, and effective field theory, 
479-482 

graviton propagator, 437-438
graviton: coupling to matter, 435; definition of, 83; 
deformed polarizations of, 514-515; as elementary
particle, 434, 448; force associated with, 29; in
(n + 3 + 1)-dimensional universe, 41-42; recent 
developments on, 513-520; self-interaction of, 
434; in spacetime, 515-517; spin of, 35, 39; in 
string theory, 513-514, 516 
gravity: Einstein-Hilbert action for, 433-434; 
Einstein on (see Einstein’s theory of gravity); 
and electromagnetism, unification of, 442; 
as field theory, 434-436; as gauge theory,
436; helicity structure of, 446; of light, 441;
Newton on (see Newton’s gravitational force); 
nonrenormalizability of, 434; weak field action 
for, 436-437 

Green’s function(s), 47, 55, 352; generating, 50;
propagator related to, 23
Gross, David, 386 
Gross-Neveu model, 402-404 
ground state, in quantum field theory, 37, 225 
ground state degeneracy, 319 
group theory, review of, 525-533. See also special 
orthogonal group SO(N); special unitary group
SU(N)

hadron(s): electron-positron annihilation into, 389- 
391; in electroweak unification, 383; experimental 
observation of, 231; quarks as components of, 385 
Hall effect, 351-352; fractional, 323-324; integer, 323 
Hall fluid(s), 322-330; Chern-Simons term for,
326; effective field theory of, 324-325, 452-
453; electron tunneling in, 329; five general 
statements/principles of, 325, 329; gauge potential 
in, 325-326, 329; incompressibility of, 323, 328; 
Laughlin odd-denominator, 327; order in, 328 
handedness, field, 100; charge conjugation and, 101 
Hansson, T., 324
harmonic paradigm, 5 
Hasslacher, Brosl, 405 
Hawking radiation, 290-291 
Heaviside, O., 24, 245 
hedgehog, 308 

Heisenberg, Werner: approach to quantum
mechanics, 61-62; and effective field theory, 460;
isospin SU(2) of, 531; isospin symmetry of, 387,
388; on neutron and proton, symmetry of, 77 
helicity, topological quantization of, 532-533 
helicity formalism, spinor, 486-491, 496, 501, 521 
hierarchy problem, grand unification and, 419 
Higgs field, covariant derivative of, 266 
Higgs particle, mass of, 384 
high energy physics: renormalization group in, 
359-360 

high frequency behavior, 208-210 
Hofstadter, R., 199 
homotopy groups, 307 



Hopf term, 318; non-local, 329 
Howe, P., 470 

Hubbard-Stratonovich transformation, 192 

identity, absolute, 120-121, 134 
imaginary part, of Feynman diagrams, 207-219, 
208f, 213f 

impurities, 323; condensed matter physics and study 
of, 354; and random potential, 350 
infinities, in quantum field theory, 161-162 
instanton(s), 309; discovery of, 473 
integer Hall effect, 323 

integration measure, in path integral formalism, 67 

integration variables, shifting, 272 

interchange symmetry, 77 

internal symmetry, 77 

inverse square law, 40 

irrelevant operators, 363 

Ising model, 361 

isospin symmetry of Heisenberg, 387, 388 
Itzykson, Claude, 402 
Iwasaki, Y., 439 

Johansson, H., 519 
Jona-Lasinio, G., 237 
Jordan, P., 107 

Josephson junction, fundamental relation 
underlying, 192 

Kadanoff, Leo, 361; and renormalization groups,
361

Kaluza-Klein compactification, 442-443; derivation 
of, 447 

Kardar-Parisi-Zhang equation, 347 
Kawai, H., 513 
kinks. See solitons 
Kivelson, Steve, 324 

Klein-Gordon equation, 21, 93, 95, 190; Schrödinger
equation derived from, 190 
Klein-Gordon operator, 113 
Klein-Nishina formula, 155 
Kockel, B., 460 

Kosterlitz-Thouless transition, 310 
Kramer’s degeneracy, 103 

Lagrangian: Dirac (see Dirac equation); gauge 
invariant, 253-254; Maxwell (see Maxwell 
Lagrangian); Meissner, 332, 335; as mnemonic, 
340, 342; for quantum electrodynamics, 101, 144; 
symmetries of, breaking, 223; weak interaction,
100; Yang-Mills, 257 
Lamb shift in atomic spectroscopy, 205 
Landau, L. D., 264; on complex momenta, 498; on 
London penetration length, 296; on second-order
transitions, 292; on superconductivity, 295; on
superfluidity, 284 
Landau gauge, 149 

Landau-Ginzburg approach to quantum field theory, 
18 

Landau-Ginzburg theory (mean field theory), 
292-294; order in, 328 
Landau levels, 323 
Laplace, P.-S., 290 
Large Hadron Collider, 483 

large N expansion, 394-396; Dyson gas approach to, 
400-402; field theories in, 402-404 
Larmor circle, 322, 323 

lattice gauge theory, 374-376; Wilson loop in, 
376-377, 457 

Laughlin odd-denominator Hall fluids, 327 

Lee, B., 173 

Lee, D. H., 336n 

Lee, Tsung-dao, 100 

Legendre transform, 238-239 

Leinaas, J., 315 

length scales: in condensed matter physics, 169;
renormalization group and, 361-362
leptons: families of, 384; generations of, 428; and 
quarks, neutral current interaction between, 383 
Levy, M., σ model of, 340-341
Lewellen, David C., 513 
Licciardello, D., 366 
light, gravity of, 441 
light beam, stress-energy tensor of, 445 
Likhtman, E. P., 461 

linearly dispersing mode, 284; velocity of, 285 
local field theory, 474, 521-522 
localization: Anderson, 351, 354; Anderson, in 
renormalization group language, 366-367; study 
of, 355 

local transformation (see gauge transformation) 
logarithmic divergence, 175, 176-177 
London penetration length, 296 
loop diagrams, 45, 57-58, 57f, 58f, 181, 494 
Lorentz algebra, 114-116 
Lorentz boosts, 114-115 

Lorentz group: defining representation of, 116; 
generators of, algebra for, 115-116; spinor 
representation of, 116-118 
Lorentz invariance, 475; canonical formalism and, 
63, 66-67; Euclidean equivalent of, 362; in 
quantum field theory, 18, 24; recent developments 
in, 507-510 

Lorentz transformation: and Dirac equation, 96-97 
Low, F., 460 
Lüders, G., 121

MacDonald, Alan, 328 
magnetic charge (monopole), 249, 309; confinement
in superconductor, 386-387; Dirac quantization 
of, 248-249, 252; duality of, 334; electrically 
charged (dyon), 309; mass of, 309; and Maxwell’s 
equations, 249; quantum mechanics and, 245; in 
spontaneously broken gauge theory, 309 
magnetic moment of electron: anomaly in, 196; 
calculation of, 454; in Dirac equation, 194-195; 
Schwinger on, 196-198, 454 
magnetic moment of ferromagnet and 
antiferromagnet, 344 

magnetic moment of proton, anomaly in, 454-455 

Majorana, Ettore, 102n, 543 

Majorana equation, 102 

Majorana mass, 102; for neutrino, 102 

Majorana spinor, 102, 543 

Mandelstam variables, 137-138, 498 

marginal operators, 364 

mass(es): attraction between, 35-36; of electron,
180; energy of, 35; of gauge boson, 266; of Higgs
particle, 384; of magnetic charge (monopole), 309; 
Majorana, 102; of neutrino, 102; of nucleon, 341; 
Planck, 41-42, 434; of soliton (kink), 305 
massive gauge field, Nambu-Goldstone boson and, 
264-265 

massive spin 1 field, vs. massless spin 1 field, 183 
massive spin 1 particle: degrees of freedom of, 38; 
degrees of polarization of, 34; field theory of, 
32-33; propagator for, 34; in Yang-Mills theory, 
379 

massive spin 2 particle: degrees of polarization of, 
35; propagator for, 35, 439 
mass renormalization, 174 
matrix (matrices): gamma (see gamma matrices); 
Gell-Mann, 265; Pauli, 265; without inverse, 
182-183 

matter: dark, 450; origin of, explanation for, 418; 
states of, 328 

mattress model of scalar field theory, 4-5, 4f; 
disturbing, 21, 21f; path integral description of,
17-19 

Maxwell, James Clerk, 521 
Maxwell action, 182 

Maxwell equations, magnetic charges and, 249 
Maxwell Lagrangian, 32, 34, 84; bypassing, 33-34;
derivation of, 38
Maxwell term, 320, 329 

Maxwell theory of electromagnetism: development 
of, 474-476; Yang-Mills theory compared with, 
257 

mean field theory (Landau-Ginzburg theory), 
293-294; order in, 328 
Meissner effect, 296, 386 
Meissner Lagrangian, 332, 335 


meson(s): birth of, quantum field theory on, 55-56, 
56f; π (see pion[s]); σ, 341, 342; soliton compared
with, 304; vector, field theory of, 32-33. See also 
massive spin 1 particle 

meson-meson scattering amplitude, 357; canonical 
formalism and, 64-65; cutoff dependence of,
173; dimensional analysis on, 173; divergence
of, 161-162; path integral formulation of, 166; 
regularization and, 163; renormalization and, 
164-166 

Michell, John, 290 

Mills, Robert, and nonabelian gauge theory, 253,
255

Minkowski, Peter, and seesaw mechanism, 426 
Minkowskian path integral, 287 
Minkowskian spacetime, 36 
momentum: complex, 498-501, 499f; fundamental 
definition of, 83; orbital angular, Dirac equation 
on, 194-195; spin angular, Dirac equation on, 195; 
square root of, 486-489 

momentum density, in nonrelativistic theory, 191 
momentum space, 26; fermion propagator in, 113;
Feynman diagrams in, 54
monopole. See magnetic charge 
Montonen, J., 334 
muon, weak decay of, 380 
Myrheim, J., 315 

Nambu, Yoichiro, 297, 469; Nobel prize for, 228n 
Nambu-Goldstone boson(s), 228-229; gapless mode 
as, 284; in massive gauge field, 264-265; π
mesons (pions) as, 234, 387, 388; in relativistic vs. 
nonrelativistic theories, 285 
naturalness, notion of, 419 
Neel state, for antiferromagnet, 346 
Ne’eman, Y., SU(3) of, 531
neutral current interaction, 383 
neutrino(s): handedness of, 101; mass of, 102 
neutrino masses, effective field theory of, 456 
neutron(s): β decay of, 231-232; electric dipole
moment for, 259; and proton, internal symmetry 
of, 77 

Neveu, Andre, 402, 405 

Newton’s gravitational force: and Coulomb’s electric 
force, comparison of, 29; derived from Einstein- 
Hilbert action, 438; quantum field theory on, 32, 
33-36 

Noether current, 191, 234 
Noether’s theorem, 78-79, 100, 341; elaborate 
formulation of, 80 
nonabelian Berry’s phase, 261, 346 
nonabelian gauge potential, 254, 255; coupling to a 
fermion field, 260 

nonabelian gauge theory(ies), 253-260; chiral 
anomaly in, 276; differential forms in, 255,
256; Feynman rules in, 536-537; gauge
invariance in, preserving, 204; ghost action
in, 372; redundancy of, Faddeev-Popov
approach to, 183; renormalizability of, 173,
411; strong interaction described by, 259, 379;
’t Hooft double-line formalism and, 258-259;
unsatisfactory features of, 474. See also Yang-Mills 
theory 

nonanalyticity: emergence of, 292; symmetry 
breaking and, 293 
noncommutative field theory, 474 
nonrenormalizable theory(ies), 169, 179, 453; 
counterterms in, 179, 241; Einstein’s theory 
of gravity as, 172; Fermi’s theory of the weak 
interaction, 170, 179 
nonrenormalization of the anomaly, 277 
notation, dotted and undotted, 116-117, 541-544; 
replacing, 475 

nucleon(s): attraction between, 28; electron scattering 
off of, deep inelastic, 386; mass of, 341; and pions, 
interaction between, 340-341; wave function of 
quarks in, 385 

Olive, D. I., 334 
optical theorem, 215-216, 219 
orbital angular momentum, Dirac equation on, 
194-195 

order parameters, 295 

orthogonal groups, embedding unitary groups into, 
423-424 

Parisi, Giorgio, 402 

parity, 98; Dirac equation and, 98; and Dirac spinor, 
117; weak interaction and, 100, 379-380 
Parke, S. J., 493 

particle(s): birth and death of, 4-5; birth of, quantum 
field theory on, 55-56; field associated with, 26- 
27; force associated with, 27-29; interchanging, 
315-316, 316f; propagation of, describing, 48- 
49, 50; scattering of (see scattering of particles); 
sources and sinks for, 20. See also specific particles 
particle physics: and condensed matter physics, 281, 
452-453; energy scales in, 169; family problem in, 
428; spontaneous symmetry breaking in, 292, 297, 
449 

partition function, in quantum statistical mechanics, 
288-289 

path integral formalism: vs. canonical formalism,
44, 61, 67; chiral anomaly and, 278; and classical
limit, 19; derivation of, 44; description of mattress
model, 17-19; Dirac on, 10-13; Feynman on, 7-10;
Grassmann math and, 127; history of, 60;
integration measure in, 67; replacing, 475; for
spinor field, 124; and vacuum energy, calculation
of, 123-125 

Pauli, Wolfgang, on spin-statistics connection, 121 
Pauli exclusion principle, 120, 323; history of, 120n 
Pauli-Hopf identity, 345
Pauli matrices, 265 

Pauli-Villars regularization, 75, 166-168 
Peierls, Rudolf, 365 
Peierls instability, 300 
pentagon anomaly, 276, 277f 
perturbation theory, 49-51; bare, 175; Feynman 
diagrams in, 55, 56f; finite temperature, 289; 
physical (renormalized/dressed), 175-176, 176f
perturbative quantum gravity, 441 
φ⁴ theory, renormalizability of, 173, 175
phonon(s), 5, 284 

photon(s): absence of rest frame for, 186-189; birth 
and death of, 4; Bose-Einstein statistics for, 120; 
degrees of freedom of, 186-187; electron-positron 
annihilation into, 155; emission and absorption of, 
150; fluctuation into electron and positron, 200- 
202, 201f; force associated with, 29; longitudinal 
mode of, 150; spin of, 36, 39 
photon propagation: charge as measure of, 204;
quantum fluctuations and, 200-202, 201f
photon propagator, 149-150; Fourier transform of, 
205; physical (renormalized), 201, 201f 
photon scattering, 152-157; cross sections for, 
152-155; on electrons, 152-157, 153f, 157f 
physical perturbation theory, 175-176, 176f 
pion(s) (π meson): massless, 235, 341; Nambu-
Goldstone boson, 388; as Nambu-Goldstone
boson, 234, 387; and nucleons, interaction 
between, 340-341; prediction regarding, 29; 
quarks as components of, 385; weak decay of, 
231-233 

pion-nucleon coupling constant, 235 
Planck mass: modified, 434; for (n + 3 + 1)- 
dimensional universe, 41-42 
Planck’s constant, 181 
Podolsky, B., 441 
Poincare lemma, 247 

point particle: action of, constructing, 84-86; stress 
energy of, calculating, 86; world line traced out by, 
length of, 84, 85f 
Poisson equation, 438 
polarization, degrees of, 34 
Politzer, H. D., on Yang-Mills theory, 386 
Polyakov, Alexander (Sasha), 498; on magnetic 
monopoles, 309 
Polyakov action, 470 
Pontryagin index, 310 

positron(s): Dirac’s conception of, 5; photon 
fluctuation into, 200-202, 201f 
potential energy, double-well, 224, 224f
power counting theorem, 176-178 
preons, theories about, 278 
product rule, 445 

propagation of particles, describing, 48-49, 50 
propagator, 23-25; in canonical formalism, 67-68; 
for Dirac field, 127; fermion, 112; graviton, 437- 
438; for massive spin 1 particle, 34; for massive 
spin 2 particle, 35, 439; photon, 149-150 
proton(s): charge of, grand unification on, 410; 
electron scattering off of, 132-134, 133f, 199; 
electron scattering off of, deep inelastic, 359; 
electron scattering off of, Schrodinger equation 
for, 3; magnetic moment of, anomaly in, 454-455; 
and neutron, internal symmetry of, 77; quarks as 
components of, 385; stability of, 413 
proton decay: branching ratios for, 416-417; effective 
theory of, 455-457; grand unification and, 
413-414, 415, 456; slow rate of, 418 

quantum chromodynamics (QCD), 360, 386; analytic 
solution of, search for, 391; at high energies, 391; 
large N expansion of, 394-396; renormalization 
group flow of, 388-389 

quantum electrodynamics (QED), 32; coupling 
constant of, 164; coupling in, 358; electromagnetic 
gauge transformation in, 189; Feynman on 
difficulty of, 61; Feynman rules for, derivation 
of, 144-150; intellectual incompleteness of, 121; 
Lagrangian for, 101, 144; renormalizability of,
173

quantum field theory(ies): in (0 + 0)-dimensional 
spacetime, 397; in 2-dimensional spacetime,
470; anharmonicity in, 43, 89; asymptotic
behavior of, study of, 359-360; central identity
of, 182, 523; and condensed matter physics,
5, 190, 281; crisis of, 231, 340, 452; in curved
spacetime, 82, 290; divergences in, 57-58, 161-162;
Euclidean, 287-288, 289, 290; at finite
density, 291; at finite temperature, 289-290; 
gravity as, 434-436; ground state in, 37, 225; 
harmonic paradigm and, 5; hidden structures 
in, 476; history of, 60; infinities in, 161-162; 
innovative applications of, 473-474, 476; integral 
of, 88-89; low energy manifestation of, 162, 169, 
452; mattress model and, 17-19; motivation 
for constructing, 55; need for, 3-5, 6, 123; 
nonrelativistic limit of, 190-191; relativistic 
vs. nonrelativistic, 191-193; renormalizable 
vs. nonrenormalizable, 169; on repulsion and 
attraction, 32-36; restrictions within, 474; steps 
toward, 235; strong and weak interactions 
applied to, 231; of strong interaction, 340; 
supersymmetric, 461, 467-468; surface growth 
and, 347-349; symmetry breaking in, 225-226;
theories subsumed by, 473; threshold of ignorance
in, 162-163, 453; triumph of, 452, 473; vacuum 
in, 20 

quantum fluctuations: axial current conservation 
destroyed by, 274-275; effective potential 
generated by, 243; and electric charge, 204, 205; 
first order in, 239-240; higher order, and chiral 
anomaly, 310; and photon propagation, 200-202, 
201f; and symmetry breaking, 229, 237, 242, 270 
quantum Hall fluid. See Hall fluid(s) 
quantum Hall system, 281 
quantum mechanics: antimatter as requirement 
in, 157; and general relativity, marriage of, 6; 
harmonic oscillator in, solving, 43; Heisenberg’s 
approach to, 61-62; and magnetic monopoles, 
245; partition function in, 288-289; path integral 
formalism of, 7-12; quantum field theory as 
generalization of, 88-89, 473; and relativistic 
physics, joining in spin-statistics connection,
122; and special relativity, marriage of, 3, 6,
121; symmetry breaking in, 225-226; symmetry
of, 270; time reversal in, 102-104; and vector 
potential, need for, 245 
quantum statistics, 120 
quantum vacuum, 358 

quark(s): color of, 385, 386; confinement of, 377, 
386-387; in electroweak unification, 383; families 
of, 384; flavors of, 385; generations of, 428; and 
leptons, neutral current interaction between,
383; origins of concept, 235; strong interaction
between, weakening of, 360 
quasiparticle(s), 326; charge of, 327; fractional 
statistics and, 327; as vortex, 328 

radiation: and atoms, interaction between, 3;
Hawking radiation, 290-291
Ramakrishnan, T. V., 366 
Ramond, Pierre, and seesaw mechanism, 426 
random dynamics, and quantum physics, 349 
random matrix theory, 396-397; Feynman rules in, 
397, 398f 

random potential, impurities and, 350 
Rarita-Schwinger equations, 119 
Rayleigh, Lord, 458 

recursion, 501-503, 507-512, 521; BCFW, 500, 507, 
514 

redundancy, Faddeev-Popov approach to, 183-185 
reflection symmetry, 76, 226; breaking, 223, 224,
225

Regge, T., 498 

regularization, 163; Casimir force and, 71-75; 
dimensional, 167, 168, 204; gauge invariance 
respected by, 202-204; Pauli-Villars, 75, 166-167
relativistic physics: equations of motion in, unified 
view of, 95; language of, 26; and quantum physics, 





joining in spin-statistics connection, 122. See also 
general relativity; special relativity 
relativistic quantum field theory: correctness of, 
establishment of, 196; vs. nonrelativistic quantum 
field theory, 191-193 
relevant operators, 363 

renormalizable conditions, imposing, 241-242 
renormalizable theory(ies), 169, 173, 453; electroweak
theory as, 384; nonabelian gauge theory as, 173,
411; φ⁴ theory as, 173, 175, 178-179; Yukawa
theory as, 179 

renormalization, 161, 164-166; coupling, 173-174; 
of electric charge, 205; field, 175; mass, 174; wave 
function, 175 

renormalization group, 356, 358; and Anderson 
localization, 366-367; in condensed matter 
physics, 360-363; and effective description,
367; effective field theory philosophy and, 453;
in high energy physics, 359-360; in quantum 
chromodynamics, 388-389 
renormalization theory, application of, 240-241 
renormalized coupling constant, 166 
renormalized (dressed) perturbation theory, 175-176, 
176f 

reparametrization invariance, 84 
replica method, 353-354 
representations: conventions for naming, 526; 
multiplying, 530-531 

repulsion: of bosons, 192-193, 283, 337; quantum 
field theory on, 32-33; spin 1 particle and, 36-37; 
of vortices, 338 

rest frames, for photons, absence of, 186-189 
Ricci tensor, 433 

Riemann-Christoffel symbol, 85, 445 
Riemann curvature tensor, 433, 480-481, 515-516 
Riemannian manifolds, differential geometry of, 
443-444 

Rosenbluth, Marshall, 105 

rotation group, 114; and Lorentz group, symmetry 
of, 118 

Rξ gauge, 267, 268

Salam, Abdus, 171; electroweak theory of, 383;
superspace and superfield formalism of, 462, 463
scalar boson operator, 113 

scalar field: complex, 65-66; Feynman rules for, 54- 
55, 534-535; quantizing in curved spacetime, 82; 
and vacuum energy, 66 

scalar field theory: classical field equation in, 20; 
Euclidean functional integral and, 287; Euclidean 
version of, 293; massless version of, 284; simplicity 
of, 519-520 
scalar potentials, 245 

scattering of particles: describing, 51-53, 51f, 52f;
fermion-fermion, Feynman diagram for, 172;
meson-meson (see meson-meson scattering
amplitude); reflection symmetry in, 76; and 
vacuum fluctuations, 124. See also electron 
scattering 

Schouten identity, 492, 493 
Schrieffer, Bob, 297 

Schrodinger equation: electromagnetic gauge 
transformation in, 189; Klein-Gordon equation 
and, 21n, 190; limitations of, 3; Yang-Mills 
structure in, 261 

Schwarz, John, on string theory, 470n 
Schwarzschild black hole, 311 
Schwarzschild solution, for Hawking radiation, 290 
Schwinger, Julian: on complex plane, 208; 
and effective potential, 237; on Feynman’s 
contribution, 43, 50, 56; on magnetic moment 
of electron, 196-198, 454; on path integral 
formalism, 60; at Pocono conference (1948), 105; 
teaching style of, 454; Yang-Mills theory and,
379

second-order phase transitions, 292 
seesaw mechanism, 37, 426 
Seiberg, Nathan, 334 
self-dual theory, 337 
semions, 315 
σ meson, 341, 342

σ model, 340-341; for ferromagnets and
antiferromagnets, 345, 346; nonlinear, 342, 346
sky color, effective field theory of, 457-458 
Slansky, Dick, and seesaw mechanism, 426 
S-matrix theory, 68, 235, 340, 498-501 
solid state physics, Dirac equation in, 298, 299 
solitons (kinks): discovery of, 302-304, 473; 
dynamically generated, 400-405; mass of, 304; 
topological stability of, 304; unifying language for 
discussing, 307 

SO(N). See special orthogonal group
sources and sinks, creating, 20, 51, 51f
spacetime: curved (see curved spacetime); dimension 
of, and symmetry breaking, 229; discretizing, 22; 
Feynman diagrams in, 54, 58, 213; gravitational 
waves in, 479-482; graviton in, 515-517; symmetry 
of, Lorentz invariance as, 76 
special orthogonal group SO(N), 525-527; binary 
code in, 427-428; review of, 531-532; SO(3), 526- 
527; SO(10) grand unification, antineutrino field 
in, 425-426; SO(18), 428; spinor representation 
of, 421-423, 424, 426 

special relativity: antimatter as requirement in, 157;
and quantum mechanics, marriage of, 3, 6, 121
special unitary group SU(N), 527-530; decomposing
representations of, 531; of Heisenberg, 531; SU(2),
529-530; SU(3), 529, 530; SU(3), of Gell-Mann
and Ne’eman, 531; SU(5), 531; SU(5),
Georgi and Glashow theory of, 407-409
spin angular momentum, Dirac equation on, 195
spinor(s): Dirac, 94, 96, 114, 117; Majorana, 102;
representations of, 116-117; Weyl, 117, 462
spinor field: deriving, 125-127; path integral for, 123; 
path integral for, Grassmann numbers in, 124; 
vacuum energy of, 111-112, 125 
spinor helicity formalism, 486-491, 496, 501, 521 
spin-statistics rule, 120-121; and anticommutation 
relations, 122, 123; price of violating, 121-122 
spin wave, 229 

spontaneous symmetry breaking, 224, 225, 227; 
continuous: and massless fields, 228-229; in 
gauge theories, 263-265; in particle physics,
292, 297, 449; quantum fluctuations and, 229;
of reflection symmetry, 225; in relativistic vs. 
nonrelativistic theories, 285; second-order phase 
transitions and, 292; and superfluidity, 283-284 
square anomaly, 276, 277f 
square root of momentum, 486-489 
steepest-descent approximation, 16 
Stokes’ theorem, 307 
Stoner, E. C., 120n 

Strathdee, J., superspace and superfield formalism 
of, 462, 463 

stress-energy tensor, 35; definition of, 83; of light 
beam, 445; properties of, 84 
string theory: 2-dimensional field theory, 469- 
470; as candidate for unified theory, 433, 452; 
and cosmological constant problem, inability 
to resolve, 450; duality of, 334; future of, 513; 
graviton in, 515-517; Kaluza-Klein idea and, 442; 
origins of, 6, 387; p-forms in, 251; in quantum 
field theory, 473; Schwarz on, 470n 
strong coupling: fixed point in, 359; linking to 
perturbative weak coupling, 473 
strong interaction: chiral symmetry of, 234; currently 
accepted theory of, 379; fundamental theory 
of, 360; hadronic, 36; at low energies, 340-341; 
nonabelian gauge theory on, 259, 379; quantum 
field theory of, 235, 340; renormalization group 
flow applied to, 368; symmetries of, 234, 387-388 
SU(N). See special unitary group
supercharges, 464 

superconductivity, 295-297; and Meissner effect, 296 
superconductor(s): monopole confinement in, 
386-387; type II, flux tube in, 307 
superfield, 464-465; chiral, 464, 466; vector, 466-467 
superfluidity, 192; gapless excitations and, 284-285; 
Lagrangian summarizing, 284; linearly dispersing 
mode of, 284; spontaneous symmetry breaking 
and, 283-284 

superspace and superfield formalism, 462-463 
superstring theory, 470
supersymmetric action, 466 


supersymmetric algebra, 462-463 
supersymmetric field theories, 461, 467-468;
Yang-Mills, 392, 467-468
supersymmetric method, 355 
supersymmetric transformation, total divergence 
under, 465-466 

supersymmetry, 112; Dirac spinor and, 114;
inventing, 462; motivations for, 461
surface growth, 347, 360; and quantum field theory, 
348-349 

Swieca, Jorge, 230n 

symmetry, 76-80; in amplitudes, 78; breaking,
226; chiral, 234, 387, 388, 419; classical vs.
quantum, 270-271; conserved current and, 78-79;
continuous, 77-78, 226; in field theories,
475; Grassmannian, 355; Heisenberg isospin,
387, 388; interchange, 77; internal, 77; power 
of, 18, 76, 118; reflection, 76, 226; replica, 353; 
of spacetime, Lorentz invariance as, 76; strong 
interaction, 234, 387-388; tensors and, 526, 528. 
See also supersymmetry 

symmetry breaking, 223-230; continuous symmetry 
and, 226; dimension of spacetime and, 229; 
dynamical, 230, 388; in gauge theories, 263- 
265, 268, 296; and nonanalyticity, 293; quantum 
fluctuations and, 229, 237, 242, 270; in quantum 
mechanics vs. quantum field theory, 225-226; 
reflection symmetry and, 223, 224, 225; and 
superfluidity, 283-284; and vacuum energy, 449. 
See also spontaneous symmetry breaking 

Taylor, T. R., 493 
Teller, Edward, 105 

temperature: black hole, 290; and cyclic imaginary 
time, 289; finite, quantum field theory at, 289-290 
tensor(s): energy-momentum, 319; of light beam, 
445; of orthogonal group, 525-526; Ricci, 433; 
Riemann curvature, 433; stress-energy, 35, 83- 
84; symmetry properties of, 526, 528; of unitary 
group, 527-528; vacuum polarization, 200, 201f, 
204, 208, 209f, 211, 216, 218 
tensor field, 35, 83 
θ term, 259

Thomas precession, 115 

’t Hooft, Gerardus, 173; on electroweak theory,
384; on large N expansion, 394; on magnetic
monopoles, 309 

’t Hooft double-line formalism, 258-259
3-brane, 40-42 

time ordering in canonical formalism, 67-68 
time reversal, 102-104; and Dirac equation, 104 
Tolman, R., 441 

Tolman-Ehrenfest-Podolsky effect, 441, 446 
Tomonaga, Shin-Itiro, 60 
topological current, 304
topological field theory, 318 

topological objects, 306; discovery of, shock of, 311.
See also specific objects
topological order, 328 

topological quantum fluids, 322. See also Hall fluid 
total divergence, under supersymmetric 
transformation, 465-466 
trace, 526 

tree diagrams, 45, 483-484, 491-494 
Treiman, Sam, 236. See also Goldberger-Treiman 
relation 

triangle anomaly, 271f, 276 
twistor space, 494-495 
Tye, Henry, 513 

ultraviolet catastrophe, 448 
ultraviolet divergence, 162 
uncertainty principle, 3, 290 
unification. See grand unification 
unitarity, 215-216, 500 
unitary gauge, 267 

unitary groups, embedding into orthogonal groups, 
423-424 

universe: 3-brane, 40-42; early, 290; formation of 
structure in, 36 

vacuum: disturbing of, 20, 21f, 70-73; quantum, 20, 
358 

vacuum energy: calculation of, using path integral 
formalism, 123-125; disturbance of vacuum and,
70-73; of free scalar field, 66; of free spinor field 
(Dirac field), 111-112, 125; Grassmann path 
integral for, 127; symmetry breaking and, 449 
vacuum expectation value, 226 
vacuum fluctuations, 59-60, 59f; Feynman diagram
corresponding to, 129; scattering of particles and, 
123 

vacuum polarization tensor, 200, 201, 204, 208, 209f, 
211, 216, 218
van Dam, H., 439 

van der Waerden notation. See dotted and undotted 
notation 

vector field, interacting with Dirac field, 100;
Feynman rules for, 129, 129f, 535-536
vector meson (massive spin 1 meson): field theory 
of, 32-33. See also massive spin 1 particle 
vector potential, 245 
vector superfield, 466-467 

Veltman, Tini, 173, 439; on electroweak theory, 384 
vielbeins, 443 

visual perception, application of field theory to, 476 
vortex (vortices), 306, 331-332; as charges in dual 
theory, 332-334; density of, 333; duality of, 334;
as flux tube, 307; motion in fluid, 338-339, 338f;
paired with antivortex, 310-311, 339; quasiparticle 
as, 328; repulsion of, 337 

Ward-Takahashi identity, 149, 411 
wave function(s), Anderson localization of, 351, 354 
wave function renormalization, 175 
wave packets, in mattress model, 4-5, 4f
weak interaction, 37; intermediate vector boson of, 
171-172, 309; and parity, 100, 379-380; quantum 
field theory applied to, 231. See also Fermi theory 
of the weak interaction 
weak interaction Lagrangian, 100 
Weinberg, Steve, 171, 508; electroweak theory of, 383 
Weisskopf phenomenon, 180-181; grand unification 
and, 419 

Wen, Xiao-gang, 324, 328; and topological order, 328 

Wentzel, Gregory, 105 

Wess-Zumino model, 462 

Weyl basis, 98-99, 118

Weyl-Eddington terms, 457 

Weyl spinors, 117; and supersymmetry, 462 

Wheeler, John, 365n 

Wick, Gian Carlo, 14 

Wick contractions, 14-16, 47 

Wick rotation, 12, 287 

Wick theorem, 14 

Wigner, Eugene: on antisymmetric wave function 
of electron, 107; and law of baryon number 
conservation, 413; and random matrix theory, 396; 
on time reversal, 102 
Wigner semicircle law, 397-400 
Wilczek, Frank, 315, 316; on Yang-Mills theory, 386 
Wilson, Ken, 161; and complete theory of critical 
phenomena, 293; and effective field theory 
approach, 452; and lattice gauge theory, 374-376; 
and renormalization groups, 361 
Wilson loop, 261; in lattice gauge theory, 376-377, 
457; and quark confinement, 386 
Witten, Ed, 334, 500 
Wu, Tai-tsun, 248 

Yanagida, T., and seesaw mechanism, 426 
Yang, Chen-Ning, 100, 105, 248; and nonabelian 
gauge theory, 253, 255 

Yang-Mills bosons, 257, 386; self-interaction of,
434

Yang-Mills coupling constant, 258-259 
Yang-Mills Lagrangian, 257 
Yang-Mills theory, 257-258; area law in, 377; 
asymptotically free, 386; Einstein-Hilbert action 
compared with, 434-435; Einstein’s theory 
of gravity compared with, 444-445, 513-520; 
Feynman rules in, 257, 257f, 494-495; 

gluon scattering in, 483-496; original response 
to, 371, 379; perturbative approach to, 374; 
quantizing, 371-373; recent developments 
in, 483-496, 501-504, 513-520; recursion in, 
501-503; Schrodinger equation and, 260- 
261; supersymmetric, 392, 467-468; Wilson 
formulation of, 374-376
Young tableaux, 526 
Yukawa, H., 28-29, 171


Yukawa coupling, 170 
Yukawa theory, renormalizability of, 178-179

Zakharov, V., 439 
Zee, A., 316 

Zhang, Shou-cheng, 324 
Zinn-Justin, Jean, 173 
Zuber, Jean-Bernard, 402 
Zumino, Bruno, 121, 470 



